(54) VOICE ENCODER ADOPTING ANALYSIS-SYNTHESIS TECHNIQUE BASED ON PULSE 
EXCITATION 

(57)Abstract: 

PURPOSE: To obtain satisfactory performance from a voice encoder that adopts an 
analysis-synthesis technique. 

CONSTITUTION: In an analysis-synthesis encoder, the original voice signal is 
shifted in time by a small amount so that it is aligned in time with the expected 
signal with which it is encoded, namely the replica produced by a long-term 
synthesis filter. This shift is determined in each subframe by a thorough search 
over the range of possible values, such that the energy of an error signal is 
minimized. Once the optimum shift is determined, the optimum excitation is 
searched. This excitation has a very small number of pulses arranged in a 
deterministic structure and is selected from a code book whose words are all 
obtained from a limited number of keywords. The deterministic structure allows a 
quick search for the optimum excitation without storing the code book and without 
actually executing the synthesis filtering of the candidate excitations. 
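
As a non-limiting illustration, the per-subframe shift search can be sketched as follows. The sketch is deliberately simplified: it compares the shifted signal directly with the replica and omits the synthesis and weighting filters of the encoder. The function name `best_shift`, the padded buffer layout and the integer shift grid are assumptions made for the example and are not taken from the document.

```python
import numpy as np

def best_shift(padded, replica, max_shift):
    """Return (shift, error_energy) minimizing the error energy.

    padded  : signal for one subframe with max_shift extra samples on each
              side (hypothetical layout chosen for this sketch)
    replica : expected signal for the subframe
    """
    padded = np.asarray(padded, dtype=float)
    replica = np.asarray(replica, dtype=float)
    n = len(replica)
    if len(padded) != n + 2 * max_shift:
        raise ValueError("padded must hold max_shift extra samples on each side")
    best, best_energy = 0, np.inf
    # Exhaustive search over every candidate shift in the permitted range.
    for s in range(-max_shift, max_shift + 1):
        start = max_shift + s
        err = padded[start:start + n] - replica
        energy = float(np.dot(err, err))
        if energy < best_energy:
            best, best_energy = s, energy
    return best, best_energy
```

Calling `best_shift(padded, replica, 2)` examines the five shifts from -2 to +2 and returns the one whose error energy is smallest.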



CLAIMS 



[Claim(s)] 

[Claim 1] A method of coding/decoding a sound signal, including a coding step that comprises the 
following operations: 

- sampling the original sound signal at a first sampling rate and dividing the resulting sequence of 
samples [x(n)] into successive blocks, each containing a first predetermined number Ls of samples 
or an integral multiple of said first number; 

- performing short-term analysis of the original sound signal in order to determine a group of 
linear prediction coefficients (ai) used for linear prediction filtering, short-term synthesis 
filtering and spectral weighting filtering, generating a representation of said coefficients in the 
frequency domain, and inserting in the coded signal information [j(phi)] that is related to the 
values of said representation and is valid over a period equal to the duration of the samples of 
one block or of a group of successive blocks; 

- applying said linear prediction filtering to obtain the short-term residual signal [rs(n)] over 
the samples of said block or group of blocks; 

- performing long-term analysis of said residual signal [rs(n)] in order to determine long-term 
analysis parameters comprising the long-term synthesis filtering delay d and coefficient b, and 
inserting in the coded signal information [j(d), j(b)] that is related to the values of said 
parameters and is valid over a time equal to the duration of the samples of one block or of a 
group of successive blocks; 

- reproducing the sound signal samples of each block to be coded as a reconstructed weighted sound 
signal [yw(n)], obtained by subjecting an excitation signal chosen within a set of excitation 
signals to long-term synthesis filtering, short-term synthesis filtering and spectral weighting 
filtering, each excitation signal consisting of an amplitude contribution (excitation gain) and a 
shape contribution (innovation), the latter consisting of a limited number of pulses, far smaller 
than said first number of samples, whose positions and amplitudes are predetermined and belong to 
respective finite sets; 

- time-shifting, by discrete steps, a set of samples of said residual signal [rs(n)], each set of 
residual signal samples having a number of samples equal to the number of samples in one block of 
the sound signal to be coded, in order to align the residual signal in time with the reconstructed 
residual signal [ss(n)] obtained as a result of short-term synthesis filtering of an excitation 
signal, the shift generating a modified residual signal [rm(n)] which is subjected to the same 
short-term synthesis filtering and spectral weighting filtering as were applied to said excitation 
signal, thereby producing a reconstructed weighted modified sound signal [xw(n)]; 

- determining the optimal excitation signal for the samples of each block by minimizing the energy 
of a weighted error signal [e(n)] expressed by the difference between the reconstructed weighted 
modified signal [xw(n)] and the reconstructed weighted signal [yw(n)], and inserting in the coded 
signal the information [j(s), j(gmax), j(gnor), sigma] identifying the optimal excitation signal. 

Said method is characterized in that: 

- the innovation pulses are the only non-zero samples of words consisting of said first number Ls 
of samples; 

- the innovation words for the excitation signals of a first subset belong to a limited group of 
words each containing one pair of pulses: key words have the two pulses placed at predetermined key 
positions, and the other words of the subset are obtained from each key word by shifting the pulses 
simultaneously, one position at a time, toward an edge of the word until one of the pulses reaches 
said edge or reaches the key position of another pulse in the starting word, the direction of the 
shift being the same for all words; 

- the innovation words for the excitation signals of a second subset contain only one pulse, whose 
position differs for each signal; and 

- for said determination of the optimal excitation signal, the energy of said weighted error signal 
is calculated directly by using the impulse response Q(n) of the filter that performs the synthesis 
filtering and spectral weighting filtering of the excitation signal, the calculation comprising the 
following operations: 

- said impulse response Q(n) for each possible pulse position in an excitation signal, and its 
energy Eq, are determined; 

- the first partial error signal [e1(n)], expressed by the difference between the reconstructed 
weighted modified signal [xw(n)] and the contribution [yw1(n)] of the excitation signal filtering 
memory, and the energy of that error signal are determined; 

- the first correlation R(e1 q) between said first partial error signal [e1(n)] and the impulse 
response Q(n) for each pulse of an excitation signal is determined; 

- for each excitation signal, the signal [u(n)] representing the contribution of the filtering of 
that excitation signal with zero initial conditions is determined starting from said impulse 
response; 

- the energy E(u) of said signal [u(n)], which represents the contribution of the filtering of an 
excitation signal with zero initial conditions, and the second correlation R(e1 u) between said 
signal [u(n)] and the first partial error signal [e1(n)] are determined; 

- the optimum value of the amplitude contribution is determined for each excitation signal as the 
ratio between said second correlation and the energy of the signal resulting from the filtering 
with zero initial conditions; 

- the value of the error signal energy for each excitation signal is calculated as a function of 
the energy E(e1) of said first partial error signal, of the energy E(u) of the signal representing 
the contribution of the filtering of the excitation with zero initial conditions, and of said 
second correlation R(e1 u). 
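
As a non-limiting illustration, the last operations of claim 1 can be sketched as follows. The claim only states that the error energy is a function of E(e1), E(u) and R(e1 u); the closed form used below, E(e1) - R(e1 u)^2 / E(u), is the standard least-squares result and is an assumption here, as are the function names and the use of unit-amplitude pulses. The signal u(n) is built by summing shifted copies of the impulse response, so no candidate excitation is actually filtered.

```python
import numpy as np

def zero_state_response(h, pulse_positions, length):
    """Build u(n) by summing copies of the impulse response h placed at
    each pulse position (unit-amplitude pulses are assumed)."""
    h = np.asarray(h, dtype=float)
    u = np.zeros(length)
    for p in pulse_positions:
        seg = h[:length - p]          # truncate at the end of the block
        u[p:p + len(seg)] += seg
    return u

def best_gain_and_error(e1, u):
    """Return (gain, error_energy) for one candidate excitation.

    gain         = R(e1 u) / E(u)
    error_energy = E(e1) - R(e1 u)**2 / E(u)   (assumed closed form)
    """
    e1 = np.asarray(e1, dtype=float)
    u = np.asarray(u, dtype=float)
    energy_e1 = float(np.dot(e1, e1))   # E(e1)
    energy_u = float(np.dot(u, u))      # E(u)
    corr = float(np.dot(e1, u))         # R(e1 u)
    if energy_u == 0.0:
        return 0.0, energy_e1
    return corr / energy_u, energy_e1 - corr * corr / energy_u
```

A search over the code book would call `best_gain_and_error` once per candidate word and keep the candidate with the smallest returned error energy.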

[Claim 2] The method of claim 1, characterized in that said pulses have unit amplitude. 
[Claim 3] The method of claim 1, characterized in that the sequence of sound signal samples is 
divided into frames, each containing a second predetermined number Lf of samples and each 
consisting of a plurality of successive subframes, each subframe corresponding to one of said 
blocks; said short-term analysis is performed for each frame; for the short-term analysis of a 
frame, a window of samples is analyzed whose length is Lf+P (P = the number of linear prediction 
coefficients in each group) and which contains the current frame and a predetermined number H+K 
of samples of the following frame; and said window is a trapezoidal window that weights all the 
samples except the first and the last P samples with the maximum weight, the weighting factors for 
those first and last P samples being determined by linear interpolation between the minimum weight 
and the maximum weight. 
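
As a non-limiting illustration, a trapezoidal window of the kind recited in claim 3 can be sketched as follows. The minimum weight `w_min`, the exact placement of the ramp endpoints and the function name are assumptions made for the example; the claim does not specify them.

```python
import numpy as np

def trapezoidal_window(length, ramp, w_min=0.0, w_max=1.0):
    """Window with a flat top at w_max and linear ramps over the first
    and last `ramp` samples (ramp plays the role of P in the claim)."""
    if ramp < 0 or 2 * ramp > length:
        raise ValueError("ramp must satisfy 0 <= 2 * ramp <= length")
    w = np.full(length, float(w_max))
    # Linear interpolation from the minimum weight up to the maximum weight.
    rise = np.linspace(w_min, w_max, ramp, endpoint=False)
    w[:ramp] = rise
    w[length - ramp:] = rise[::-1]
    return w
```

With the notation of the claim, the analysis window would be obtained as `trapezoidal_window(Lf + P, P)` and multiplied sample by sample with the signal before the short-term analysis.
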
[Claim 4] The method of claim 3, characterized in that the linear prediction coefficients ai are 
coefficients obtained by interpolation between the values given by the short-term analysis for the 
current frame and the values given for the preceding frame, the interpolation being performed on 
said representation for the initial subframes of each frame. 

[Claim 5] The method of any one of the preceding claims, characterized in that said linear 
prediction residual signal is subjected to low-pass filtering before the long-term analysis, 
thereby giving a filtered residual signal [rf(n)]. 
[Claim 6] The method of claim 1, characterized in that the sequence of sound signal samples is 
divided into frames, each containing a second predetermined number Lf of samples and each 
consisting of a plurality of successive subframes corresponding to one of said blocks; said 
long-term analysis is performed for each frame; and, in order to determine the long-term analysis 
parameters, a window of samples of the filtered residual signal [rf(n)] is analyzed, the window 
containing the current frame and a predetermined number H+K of samples of the following frame. 

[Claim 7] The method of claim 6, characterized in that said long-term analysis further includes the 
operation of determining, for each frame, a long-term prediction gain G representing the ratio 
between the energy of the filtered residual signal at the input of the means performing said 
analysis and the energy at the output of that means. 
[Claim 8] The method of claim 6 or 7, characterized in that said long-term analysis further 
includes the following operations: 

- classifying the sound signal segment corresponding to a frame as voiced or unvoiced depending on 
the values of said long-term analysis coefficient b and of the prediction gain G, and generating a 
first flag (V) if the segment is classified as voiced; 

- comparing the values of the long-term analysis delay d and of the coefficient b relating to the 
current frame with those relating to the preceding frame, and generating a second flag (F), which 
enables interpolation between the delay and coefficient values calculated for said preceding frame 
and those calculated for the current frame, when the delay variation is smaller than a 
predetermined amount and the coefficient values in both frames are positive. 

[Claim 9] The method of any of claims 6 to 8, characterized in that the long-term analysis delay d 
is determined as the maximum of the autocorrelation function of the filtered residual signal 
within the window used for the analysis itself; and in that, before the long-term analysis 
coefficient b and the prediction gain G for the current frame are determined, if said first and 
second flags were generated for the preceding frame, a maximum of said autocorrelation function is 
also determined in the neighbourhood of the maximum of the same function in the preceding frame, 
and said maximum is used as the delay for the current frame if it differs from the maximum in the 
window relating to the current frame by less than a predetermined value. 
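
As a non-limiting illustration, choosing the delay at the maximum of the autocorrelation function can be sketched as follows. The function name, the explicit lag range `d_min`..`d_max` and the unnormalized autocorrelation are assumptions made for the example; the neighbourhood search around the previous frame's maximum is not shown.

```python
import numpy as np

def autocorr_delay(rf, d_min, d_max):
    """Return the lag in [d_min, d_max] at which the autocorrelation of
    the filtered residual rf is largest."""
    rf = np.asarray(rf, dtype=float)
    if not 1 <= d_min <= d_max < len(rf):
        raise ValueError("need 1 <= d_min <= d_max < len(rf)")
    best_d, best_c = d_min, -np.inf
    for d in range(d_min, d_max + 1):
        # Correlate the window with a copy of itself delayed by d samples.
        c = float(np.dot(rf[d:], rf[:-d]))
        if c > best_c:
            best_d, best_c = d, c
    return best_d
```

For a residual that repeats every five samples, the function returns a delay of 5 rather than 10, because fewer sample pairs overlap at the longer lag.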

[Claim 10] The method of any of claims 6 to 9, characterized in that the value of the long-term 
analysis coefficient b is clipped to a first maximum b1 linked to the ratio between the energy of 
the filtered residual signal in the current frame and that in a preceding interval whose length is 
equal to the long-term analysis delay. 

[Claim 11] The method of any of claims 6 to 10, characterized in that the value of the long-term 
analysis coefficient b is clipped so that it is the second maximum b2 while the prediction gain G 
is below the gain threshold Gthr, and the second maximum b2 if the gain exceeds that threshold. 

[Claim 12] The method of claim 8, characterized in that said interpolation of the long-term 
analysis delay d and coefficient b is a linear interpolation extending over the whole frame, and 
in that, in the case of a non-integer interpolated delay value, the value of the corresponding 
sample of the reconstructed residual signal ss(n) is estimated by a second-order polynomial 
interpolation centred on the integer delay value nearest to said interpolated value. 

[Claim 13] The method of any of claims 6 to 12, characterized in that the information relating to 
the long-term analysis coefficient b inserted in the coded signal is an index representing the 
quantized coefficient value; a coefficient value smaller than a predetermined fraction of the 
minimum quantized value is set to 0; and, when the coefficient is set to 0, an index of delay 
information representing a value outside the interval of permitted delays, together with the 
index of said minimum quantized value, is inserted in the coded signal as the information relating 
to the long-term analysis delay d and the coefficient b. 

[Claim 14] The method of claims 1 and 8, characterized in that, in order to determine the optimal 
excitation, the excitation signals of said second subset are used depending on whether said first 
flag (V) has been generated, or if analysis of the energy distribution in the modified residual 
signal indicates an energy concentration within a short time, which denotes the onset of a voiced 
sound. 

[Claim 15] The method of claim 14, characterized in that, in order to determine the optimal 
excitation, the excitation signals of the two subsets are normalized by different normalization 
factors linked to the number of pulses in the signals of each subset. 

[Claim 16] The method of claim 14, characterized in that, if said first flag (V) is generated, the 
amplitude contribution for the excitation signals of said second subset is limited so that it does 
not exceed a threshold proportional to the absolute value of the residual signal. 
[Claim 17] The method of claim 14, characterized in that said analysis of the energy distribution 
of said modified residual signal is performed in each subframe and includes the following 
operations: 

- dividing said subframe into a plurality of partially overlapping windows, the first and the last 
window corresponding respectively to the initial and the final portion of the subframe, and each 
window after the first being shifted by only one sample with respect to the previous window; 

- determining the energy and the power of the modified residual signal in the whole subframe, and 
the energy in each of said windows; 

- determining the power for the window whose energy is maximum, and the ratio between the power in 
said window and the power in the subframe; 

- comparing said maximum energy and said power ratio with respective thresholds, said energy 
concentration being recognized if said maximum energy and said ratio are below the respective 
thresholds. 
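
As a non-limiting illustration, the window energies and the power ratio of claim 17 can be computed as sketched below. The function name, the window length parameter and the returned tuple are assumptions made for the example; the comparison with the two thresholds is left to the caller.

```python
import numpy as np

def energy_concentration(rm, win_len):
    """Return (start, max_energy, power_ratio) for one subframe.

    rm      : modified residual signal samples of the subframe
    win_len : length of the overlapping windows, each shifted by one sample
    """
    rm = np.asarray(rm, dtype=float)
    n = len(rm)
    if not 1 <= win_len <= n:
        raise ValueError("need 1 <= win_len <= len(rm)")
    sq = rm * rm
    total_power = float(sq.sum()) / n
    # Running sums give the energy of every window in one pass.
    csum = np.concatenate(([0.0], np.cumsum(sq)))
    win_energy = csum[win_len:] - csum[:-win_len]
    start = int(np.argmax(win_energy))
    max_energy = float(win_energy[start])
    max_power = max_energy / win_len
    ratio = max_power / total_power if total_power > 0.0 else 0.0
    return start, max_energy, ratio
```

A large ratio means that most of the subframe energy falls inside one short window.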

[Claim 18] The method of any of claims 6 to 17, characterized in that, if the second flag (F) is 
generated, the long-term analysis delay d is varied by an amount proportional to the overall shift 
accumulated up to the preceding frame, the absolute value of the variation being limited to a 
predetermined maximum. 

[Claim 19] The method of claim 18, characterized in that the variation of said delay is disabled if 
it would change the decision about interpolation or would bring the delay outside the 
predetermined interval of permitted values. 

[Claim 20] The method of claim 1 or of any of claims 6 to 19, characterized in that said residual 
signal is subjected to said time shift in a subframe if at least one of said first and second 
flags is generated and if analysis of the modified residual signal energy in the subframe shows 
that the corresponding sound signal segment is not silence and contains a pitch peak, the shift 
relating to a subframe being accumulated with those of the preceding subframes of the same frame 
so that the overall shift in a frame remains below a maximum shift. 
[Claim 21] The method of claim 20, characterized in that said analysis of the modified residual 
signal energy includes the following operations: 

- comparing the energy itself with an energy threshold which, when exceeded, indicates that the 
sound signal segment is not silence; 

- determining the power of the modified residual signal in the subframe and in an interval whose 
length is equal to the long-term analysis delay, and the ratio between these powers; 

- comparing this ratio with a power threshold which, when exceeded, indicates the presence of a 
pitch peak in the subframe. 

[Claim 22] The method of claim 20 or 21, characterized in that the shift for a subframe is 
determined, before the optimal excitation signal is determined, within an interval extending 
around the shift accumulated up to the preceding subframe of the same frame, and is the value that 
minimizes the energy of said first partial error signal [e1(n)]. 

[Claim 23] The method of claim 20, characterized in that, in order to determine said shift, said 
residual signal is upsampled at a second rate which is a multiple of the first rate, and the shift 
in a subframe is equal to one or more samples of the upsampled residual signal. 

[Claim 24] The method of claim 22 or 23, characterized in that said first partial error signal is 
calculated as the sum of a signal [xw2(n)], representing the modified residual signal filtered with 
zero initial conditions, and a second partial error signal [e0(n)], which is the difference between 
the memory contribution [xw1(n)] of the filtering of the modified residual signal and the memory 
contribution [yw1(n)] of the filtering of the excitation; and in that the signal [xw2(n)] relating 
to the samples in a subframe is obtained by actually filtering the modified residual signal for 
the shift values at the limits of the interval and at the mean value between the two extreme 
values, and, for each of the remaining shifts in one and the other half of the interval, is 
calculated recursively from said impulse response, starting from the value relating to the 
preceding sample. 

[Claim 25] The method of claim 24, characterized in that said interval of shift values is 
determined through the following operations: 

- two values symmetric about the accumulated value are defined as the ends of the interval; 

- the residual signal peak position in the upsampled residual signal is determined and compared 
with the peak position in the preceding subframe; 

- the extension of the interval on one side or on both sides of the accumulated value is limited, 
in order to avoid excessive shifts of past and/or future subframes that would result in the 
duplication or loss of a residual signal peak. 

[Claim 26] The method of claim 25, characterized in that, when the interval is limited on only one 
side of the accumulated value, the search for the shift is performed by also considering a certain 
fixed number of values beyond the interval end not affected by said limitation, so that the total 
number of values examined is equal to the number of values contained between said symmetric 
values. 

[Claim 27] The method of any of claims 1 to 26, including a decoding step in which: the 
representation of the linear prediction coefficients and the long-term analysis parameters are 
reconstructed starting from the information [j(phi), j(d), j(b), j(s), j(gnor), j(gmax), sigma] 
about the linear prediction coefficients, the long-term analysis parameters and the excitation 
signal; the reconstructed linear prediction coefficients are obtained from the reconstructed 
representation; an excitation signal is chosen within a set of excitation signals corresponding to 
that used at the coding step; and each excitation signal [s(n)] is subjected, by using the 
reconstructed linear prediction coefficients ai, the long-term analysis delay d and the 
coefficient b, to the same long-term and short-term synthesis filtering as performed at the coding 
step, so as to generate a block of reconstructed sound signal samples [y(n)]; 

characterized in that each block of the reconstructed sound signal [y(n)] is generated by 
performing short-term synthesis filtering which relies, in the initial part of the validity period 
of the linear prediction coefficients, on reconstructed linear prediction coefficients ai obtained 
by interpolation between the reconstructed values relating to the immediately preceding validity 
period and the reconstructed values relating to the current period; and in that the values of the 
long-term analysis delay d and of the coefficient b relating to two successive validity periods 
are compared and, if the delay variation is below a predetermined amount and the coefficient is 
positive in both periods, a flag corresponding to the second flag is generated, enabling 
interpolation between the long-term analysis parameter values relating to said two validity 
periods in the long-term synthesis filtering. 
[Claim 28] A device for coding/decoding a sound signal by using an analysis-synthesis technique, 
the coder of which includes: 

- means (MT) for sampling the sound signal at a first rate and dividing the sample sequence into 
blocks each consisting of a first number of samples; 

- short-term analysis means (STA, STR1) for calculating a group of linear prediction coefficients 
ai for the samples of one or more blocks, converting said coefficients into a representation in 
the frequency domain, obtaining from said representation an index j(phi) which identifies the 
coefficients and is to be inserted in the coded signal, and reconstructing the coefficients 
starting from said index, each group of linear prediction coefficients being valid over a time 
period equal to the duration of the samples of one or more blocks; 

- a linear prediction filter (LPC) which receives the blocks of signal samples from said sampling 
means (MT) and the linear prediction coefficients ai from said short-term analysis means (STA, 
STR1), and generates the short-term prediction residual signal rs(n); 

- long-term analysis means (LTA, LTR1) for obtaining from said residual signal the parameters for 
long-term synthesis filtering, comprising a delay (d) and a coefficient (b), and converting said 
parameters into indices [j(b), j(d)] to be inserted in the coded signal, the long-term analysis 
parameters being valid over a time period equal to the duration of the samples of one or more 
blocks; 

- a first filtering system (LTS1, STS1, SW) which comprises, in cascade, a long-term synthesis 
filter (LTS1) receiving said parameters from the long-term analysis means (LTA, LTR1), and a 
short-term synthesis filter (STS1) and a spectral weighting filter (SW) receiving said linear 
prediction coefficients ai from said short-term analysis means (STA, STR1), and which receives the 
signals belonging to a set of excitation signals, each including a shape contribution consisting 
of a number of pulses far smaller than said first number, with predetermined amplitudes and 
positions, and generates a reconstructed weighted signal yw(n) for each of the excitation signals; 

- means (TS) for time-shifting, by discrete steps, a set of samples of said residual signal in 
order to align it in time with the reconstructed residual signal ss(n) generated by the long-term 
synthesis filter (LTS1) of said first filtering system, the set of residual signal samples having 
a number of samples equal to said first number, and each shift step being chosen within an 
interval of permitted values; 

- a second filtering system (STS', SW') which includes a cascade of a short-term synthesis filter 
and a spectral weighting filter identical to those (STS1, SW) of said first filtering system, is 
supplied with the modified residual signal generated by the time-shift means for each value of 
said interval, and generates a reconstructed weighted modified residual signal, said first and 
second filtering systems (LTS1, STS1, SW; STS', SW') determining separately the contribution 
representing the memory of the previous filtering and the contribution representing the filtering 
with zero initial conditions; 

- means (SM, EM) for generating a weighted error signal [e(n)] by comparing the signals generated 
by said first and second filtering systems, identifying the optimal excitation signal and the 
optimal shift by minimizing the energy of said weighted error signal, and inserting in the coded 
signal the information identifying the optimal excitation signal; 

and the decoder of which includes: 

- means (LTR2, STR2) for reconstructing the linear prediction coefficients and the long-term 
analysis parameters starting from said indices; 

- a third filtering system (LTS2, STS2) which includes a cascade of a long-term synthesis filter 
and a short-term synthesis filter identical to those (LTS1, STS1) of said first filtering system, 
filters an excitation signal selected, according to the information relating to the optimal 
excitation, within a set corresponding to that used at the coding side, and generates a block of 
reconstructed sound signal samples; 

the device being characterized in that: 

- the innovation pulses are the only non-zero samples of words consisting of said first number Ls 
of samples; 

- the innovation words for the excitation signals of a first subset belong to a limited group of 
words each containing one pair of pulses: key words have the two pulses placed at predetermined 
key positions, and the other words of the subset are obtained from each key word by shifting the 
pulses simultaneously, one position at a time, toward an edge of the word until one of the pulses 
reaches said edge or reaches the key position of another pulse in the starting word, the direction 
of the shift being the same for all words; 

- the innovation words for the excitation signals of a second subset contain only one pulse, whose 
position differs for each signal; and 

- in said error signal generating means (SM, EM), the means for minimizing the error energy 
consists of a processing unit arranged so that: 

- said impulse response [Q(n)] for each possible pulse position in an excitation signal, and its 
energy [Eq], are determined; 

- the first partial error signal [e1(n)], expressed by the difference between the reconstructed 
weighted modified signal [xw(n)] and the contribution [yw1(n)] of the excitation signal filtering 
memory, and the energy of that error signal are determined; 

- the first correlation [R(e1 q)] between said first partial error signal [e1(n)] and the impulse 
response for each pulse of an excitation signal is determined; 

- for each excitation signal, the signal [u(n)] representing the contribution of the filtering of 
that excitation signal with zero initial conditions is determined starting from said impulse 
response; 

- the energy [E(u)] of said signal [u(n)], which represents the contribution of the filtering of 
an excitation signal with zero initial conditions, and the second correlation R(e1 u) between said 
signal [u(n)] and the first partial error signal [e1(n)] are determined; 

- the optimum value of the amplitude contribution is determined for each excitation signal as the 
ratio between said second correlation and the energy of the signal resulting from the filtering 
with zero initial conditions; 

- the error signal energy value for each excitation signal is calculated as a function of the 
energy [E(e1)] of said first partial error signal, of the energy [E(u)] of the signal representing 
the contribution of the filtering of the excitation with zero initial conditions, and of said 
second correlation R(e1 u). 

[Claim 29] The device of claim 28, characterized in that a low-pass filter (FPB) is provided 
between said linear prediction filter (LPC) and said long-term analysis means (LTA, LTR1). 
[Claim 30] The device of claim 28 or 29, characterized in that the short-term analysis means (STA, 
STR1) in the coder and the means (STR2) for reconstructing the linear prediction coefficients in 
the decoder include means for performing linear interpolation, on said representation in the 
frequency domain, between the values relating to two successive validity periods, and supply the 
interpolated values to the short-term synthesis filters (STS1, STS', STS2) of said filtering 
systems in the initial part of the validity period of a set of coefficients. 

[Claim 31] Encoder of any one of claims 28-30, characterized in that the long-term analysis means
(LTA, LTR1) in the encoder and the means (LTR2) for reconstructing the long-term analysis parameters
in the decoder include comparison means for comparing the parameters relevant to two consecutive
frames and for generating a flag (F) which enables interpolation between these parameters when they
fulfil predetermined conditions, and in that the long-term synthesis filters (LTS1, LTS2) of said
first and second filtering systems are associated with means which, when said flag is present,
perform second-order polynomial interpolation of said parameters extended over the whole frame and
supply the interpolated parameters to the respective long-term synthesis filter (LTS1, LTS2).

[Claim 32] Encoder of any one of claims 28-31, characterized in that said time shift means (TS)
comprises a circuit (US) for upsampling the residual signal, and storage means (SH) which store, for
each block of samples to be encoded, a first group of upsampled residual signal samples
corresponding to said first number Ls of samples and two further groups of upsampled residual signal
samples, placed respectively before and after said first group and containing a number of samples
related to the maximum permitted shift, and which, on command of the energy minimization means (EM),
supply the second filtering system (STS', SW') with a fourth group of upsampled residual signal
samples which contains the same number of samples as the first group and is shifted with respect to
the first group by said optimum shift.



DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Industrial Application] This invention relates to speech coders which adopt analysis-by-synthesis
techniques and, more specifically, to a coder suited to low bit rate applications. Preferably, the
coder can be used with good performance at bit rates in the lower part of the range 4 to 8 kbit/s.
An example of an application of this kind is the speech coder to be used on the so-called high-speed
channel of the European mobile radio system. In a coder which uses analysis-by-synthesis techniques,
for each block of speech signal samples to be encoded, the excitation signal for the synthesis
filter which simulates the speech production apparatus is chosen within a set of excitation signals
so as to minimize a perceptually significant distortion measure. This is generally obtained by
comparing the synthesized samples with the corresponding samples of the original signal and
simultaneously weighting them, in a suitable filter, with a function which takes into account how
human perception evaluates the resulting distortion. In its most general form, the synthesis filter
contains two cascaded components which impose on the excitation signal the short-term and the
long-term spectral characteristics, respectively. The former are linked to the correlation between
consecutive samples, which gives rise to a non-flat spectral envelope, and the latter are linked to
the correlation between adjacent pitch periods, on which the fine spectral structure of the signal
depends. In systems of this kind, the encoded signal includes the information relevant to the
excitation, the short-term synthesis parameters (short-term linear prediction coefficients or other
quantities related to them), and the long-term synthesis parameters (long-term delay and linear
prediction coefficient).
[0002] The introduction of long-term characteristics into the encoded signal, especially if the
delay is updated in each subframe and inside the analysis-by-synthesis loop, greatly improves the
naturalness of the signal, but the related information requires a large share of the bits needed
for coding. Particularly in low bit rate applications, it is important to find solutions which
reduce the amount of information to be transmitted to the decoder while preserving signal quality.
The paper "Generalized analysis-by-synthesis coding and its application to pitch prediction" by
W.B. Kleijn, R.P. Ramachandran, and P. Kroon, presented at the ICASSP 92 conference, San Francisco
(California, U.S.), 23-26 March 1992, paper I-337, suggests for this purpose interpolating the
long-term analysis delay, which is updated once per frame. Direct interpolation without suitable
alignment causes a time misalignment between the long-term characteristics of the original signal
and those of the synthesized signal and, since it yields delay values which are not optimum,
generates considerable distortion. To avoid this drawback, the above paper makes the long-term
predictor parameters known functions of time and suggests modifying the original signal so that
direct interpolation becomes possible without loss of performance. The suggested modification
consists of a limited time warping and a small amplitude scaling of the original signal. Time
warping is performed in a discrete manner. The need to establish the optimum amount of such time
warping clearly increases the complexity of the encoder.

[0003] To solve this problem, this invention uses a coding system in which a discrete time shift is
introduced on the residual signal before long-term analysis, and in which the searches for the
optimum excitation signal and for the optimum shift are carried out so as to reduce computational
complexity. The features of this invention are set out in the attached claims. A preferred example
of this invention will be described with reference to the accompanying drawings, in which:
- drawing 1 is a block diagram of the encoder;
- drawing 2 is a functional diagram of one block of the encoder; and
- drawing 3 is a block diagram of the decoder.
Before the encoder and decoder structure is described in detail, the basic principle is summarized.
The encoder receives the samples x(n) of the speech signal to be encoded, organized in blocks
(generally called "frames") containing a fixed number Lf of consecutive samples. Each frame of Lf
samples is then divided into subframes of Ls consecutive samples. The encoder must determine a set
of parameters to be transmitted to the decoder so that the decoder can synthesize a signal which
approximates the original signal. To this end an analysis-by-synthesis procedure is used, through
which the encoder analyzes the effect of the possible values of each parameter and chooses the value
which gives the best approximation of the original signal. For this reason the encoder contains a
replica of the decoder, which builds the output signal corresponding to each of said values. To
generate such an output signal, both the long-term and the short-term correlation of the speech
signal are exploited and imposed on an excitation signal through the respective synthesis filters.
In each frame, the encoder performs linear prediction analysis (short-term or LPC analysis) and
computes the short-term residual signal, which is used to compute the parameters (delay and
coefficient) of the long-term synthesis filter. (Since a first-order filter is used in this
preferred example, there is a single coefficient.) To improve the resolution of the long-term
correlation information, both the delay and the coefficient are interpolated when the delay of the
current frame and that of the previous frame are close in value. To reduce the effect of the time
mismatch between the original signal and the reconstructed signal, a small time shift is introduced
into the original speech signal in each subframe. The shift amount is determined through an
exhaustive search over the range of possible values so as to minimize the error energy, i.e., the
energy of the difference between the original signal and the reconstructed signal. The search for
the optimum excitation signal is performed after the optimum shift has been determined.

[0004] To make the description clearer, in the following explanation the possible excitation signals
are treated as words chosen from a finite codebook, as in encoders of the type known as CELP
(codebook excited linear prediction). This is done even though each word consists of a very small
number of pulses (preferably 1 or 2) whose amplitudes and positions are defined beforehand in a
deterministic way, so that no codebook actually has to be stored. The encoded signal includes, as
usual, the short-term and long-term synthesis filter parameters, transmitted in the form of suitably
encoded indices, and the information relevant to the optimum excitation. In the decoder, these
indices are used to retrieve the excitation signal corresponding to the one used by the encoder,
which is filtered in a cascade of a long-term synthesis filter and a short-term synthesis filter to
give the reconstructed signal. The reconstructed signal may undergo further filtering
(post-filtering) based on the short-term synthesis parameters, for example to improve the subjective
signal quality still further. The reconstructed signal is then converted back into analog form and
supplied to the utilization equipment. As an example, the following description refers to frames of
length Lf = 160 samples (at a sampling frequency of 8 kHz, this corresponds to a speech signal
segment of duration T = 20 ms), each divided into eight subframes of length Ls = 20 samples. For
reasons related to the introduction of the time shift, it is necessary to use, in addition to the Lf
samples of a frame, H+K samples (for example, H = 24, K = 8) belonging to the following frame.
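
The example figures above can be checked with a short sketch. The constant and function names are hypothetical; only the values Lf = 160, Ls = 20, H = 24, K = 8 and the 8 kHz sampling frequency come from the text.

```python
# Example values quoted in the description.
LF, LS, H, K = 160, 20, 24, 8
SAMPLING_HZ = 8000


def split_subframes(frame, ls=LS):
    """Split one frame into consecutive subframes of ls samples each."""
    if len(frame) % ls != 0:
        raise ValueError("frame length must be a multiple of the subframe length")
    return [frame[i:i + ls] for i in range(0, len(frame), ls)]


frame_duration_ms = 1000.0 * LF / SAMPLING_HZ   # duration T of one frame
samples_needed = LF + H + K                     # frame plus look-ahead samples
subframes = split_subframes(list(range(LF)))
```

Under these values a frame lasts 20 ms, holds eight subframes, and the encoder needs 192 samples in total before it can process a frame.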
[0005] Referring to drawing 1, the input signal samples x(n) on line 1 are stored temporarily in a
buffer MT, which is arranged to store N = Lf+H+K samples and is written and read every T ms, the
samples being read out as a block of Lf samples. The samples read from MT are supplied to a
high-pass filter FPA, which removes the direct-current drift and low-frequency noise, and the
filtered signal xf(n) is supplied to the short-term analysis circuit STA and to the linear
prediction filter LPC. For each frame, circuit STA determines a set of P linear prediction
coefficients ai (for example, P = 10), converts these coefficients into a set of parameters in the
frequency domain generally known as LSP (line spectrum pairs), and quantizes them, for example by
scalar quantization of the differences between adjacent parameters. The indices j(phi), which are
part of the encoded signal, are transmitted to the decoder through line 2a after binary encoding in
a circuit which is not illustrated. The conversion to line spectrum pairs is desirable because, as
is known, the spectrum lines have better properties than the coefficients with respect to
quantization, interpolation, and the checking of synthesis filter stability. Before computing the
line spectrum pairs, block STA also smooths the spectral information relevant to the formants, in
order to adapt it to the resolution of the quantization circuit. This is achieved by multiplying
each computed coefficient ai by the factor gamma1^i, where gamma1 is typically smaller than 1 but
close to 1. In the case of narrow formants in particular, this operation reduces the danger of
reproducing, after quantization, an equally narrow formant shifted with respect to the original one,
and so removes one cause of degradation of the quality of the encoded signal.
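
The smoothing step described above, multiplying each coefficient ai by gamma1^i, can be sketched as follows. The function name and the value 0.99 used in the usage line are assumptions for illustration; the text only says that gamma1 is slightly smaller than 1.

```python
def expand_bandwidth(a, gamma1):
    """Scale linear prediction coefficients a_1..a_P by gamma1**i.

    a[0] holds a_1, a[1] holds a_2, and so on.
    """
    return [coef * gamma1 ** i for i, coef in enumerate(a, start=1)]


# Usage with an assumed factor slightly below 1.
smoothed = expand_bandwidth([1.2, -0.5, 0.1], 0.99)
```

Because higher-order coefficients are attenuated more strongly, the poles of the synthesis filter move slightly towards the origin, which widens narrow formant peaks.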

[0006] Circuit STA computes the coefficients ai with the conventional autocorrelation technique
described on page 401 of "Digital Processing of Speech Signals" by L.R. Rabiner and R.W. Schafer
(Prentice-Hall, Englewood Cliffs, N.J., USA, 1978). For this computation, STA operates on a set of
Lf+P input samples (specifically, the samples which occupy the last Lf+P locations in MT), obtained
through a trapezoidal window which applies the maximum weight (specifically, 1) to all samples
except the first and the last P samples. The weights of the latter are determined by a simple
linear interpolation between the minimum weight and the maximum weight. In this way, the smoothing
needed to obtain good results with the autocorrelation technique is restricted to the overlap
region between consecutive windows. Moreover, the forward positioning of the window takes into
account the fact that, when the first subframes of a frame (for example, the first three) are
encoded, the linear prediction coefficients computed for the frame itself are not used; instead,
the coefficients used are obtained by converting line spectrum pair values determined through
interpolation between the values relevant to the previous frame and the values relevant to the
current frame. This guarantees a smooth transition between the parameters of the previous frame and
those of the current frame. The conversion of the linear prediction coefficients into line spectrum
pairs can be performed, for example, with the method proposed by P. Kabal and R.P. Ramachandran in
the paper "The computation of line spectral frequencies using Chebyshev polynomials", IEEE
Transactions on Acoustics, Speech and Signal Processing, December 1986. A more detailed description
is unnecessary, since the operation of STA is that of a standard linear prediction coder.
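
A minimal sketch of the trapezoidal window described above follows. The text gives the maximum weight (1) and the number of tapered samples (P at each end) but not the minimum weight or the exact ramp positions, so `w_min` and the ramp spacing used here are assumptions, and the function name is hypothetical.

```python
def trapezoidal_window(lf, p, w_min=0.0):
    """Build a window of lf + p weights with linear ramps on the first and last p samples.

    Assumption: the ramp rises in equal steps from w_min towards 1 without
    reaching either end value exactly.
    """
    n = lf + p
    weights = [1.0] * n
    for i in range(p):
        ramp = w_min + (1.0 - w_min) * (i + 1) / (p + 1)
        weights[i] = ramp            # rising edge
        weights[n - 1 - i] = ramp    # falling edge, mirror image
    return weights


window = trapezoidal_window(160, 10)
```

With Lf = 160 and P = 10 the window has 170 weights, of which the central 150 are equal to 1.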

[0007] The indices j(phi) are also supplied to the linear prediction coefficient reconstruction
circuit STR1, which applies the inverse of the procedure used to convert the coefficients into line
spectrum pairs and supplies the quantized values of the coefficients to filter LPC, to the
short-term synthesis filters STS1, STS', and to the spectral weighting filters SW, SW'. STR1 also
computes the interpolated values to be used in the first three subframes. For simplicity, in the
following description the quantized values are denoted ai. Filter LPC receives the filtered speech
signal samples xf(n), filters them according to the usual function 1-A(z), where [Equation 1]

$$A(z) = \sum_{i=1}^{P} a_i\, z^{-i}$$

and generates the short-term prediction residual signal rs(n). The residual signal rs(n) is
supplied to the low-pass filter FPB, which produces the filtered residual signal rf(n), and to the
time shift circuit TS, which produces the modified residual signal rm(n). As is known, low-pass
filtering eases the operation of the following long-term analysis circuit LTA. In each frame,
circuit LTA determines and supplies the delay d (pitch period), used by the following long-term
synthesis filter LTS1 to generate the reconstructed signal from the samples of the excitation
signal, and the gain b, i.e., the coefficient by which the delayed sample is weighted.
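
The action of filter LPC, i.e. filtering by 1-A(z), can be sketched as below. It assumes the usual definition of A(z) as the sum of ai z^-i for i = 1..P and treats the samples before the start of the input as zero; the function name is hypothetical.

```python
def lpc_residual(xf, a):
    """Return rs(n) = xf(n) - sum_i a_i * xf(n - i) for the whole input.

    a[0] holds a_1. Samples with a negative index are taken as zero (an
    assumption; a running coder would carry them over from the previous frame).
    """
    order = len(a)
    residual = []
    for n in range(len(xf)):
        prediction = sum(a[i - 1] * xf[n - i]
                         for i in range(1, order + 1) if n - i >= 0)
        residual.append(xf[n] - prediction)
    return residual
```

For a signal that is exactly predictable by the coefficients, the residual reduces to the first sample, which is what makes the residual a compact input for the long-term analysis.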

[0008] Block LTA computes the delay d by maximizing, as d varies between the allowed minimum and
maximum values (for example, 20 and 120), the following autocorrelation function, evaluated over a
preset number of samples which takes the window length into account so that a satisfactory value of
d can be obtained: [Equation 2]

$$R[r_f(d)] = \sum_{n} r_f(n)\, r_f(n-d) \qquad (1)$$

where the sum runs over the samples of the analysis window. As already said, the window must
include the most recent samples. Its length is a compromise between two opposite requirements: the
accuracy of the estimate increases with the window length; on the other hand, a shorter window has
its centre closer to the end of the frame to be encoded (Lf samples), which makes it possible to
obtain a current value close to the edge, as interpolation requires. For example, x is set to K. In
the preferred example the delay never falls below the length of a subframe, which greatly
simplifies the subsequent operations. The value computed by formula (1) is modified afterwards, in
order to guarantee a contour of d which is as smooth as possible and to compensate for the loss of
synchronism due to the time shift. The value of the coefficient b is determined by minimizing the
energy of the error signal r1(n) at the output of LTS1, given by [Equation 3]

$$r_1(n) = r_f(n) - b\, r_f(n-d) \qquad (2)$$

For the delay value d used for the current frame, b is given by b = R[rf(d)] / E(rf), where E(rf) is
the energy defined by [Equation 4]

$$E(r_f) = \sum_{n} r_f^2(n-d) \qquad (3)$$

A minimum value of 0 and a maximum value of 1 are set for b. Values of zero or less are excluded
because supporting a sign inversion of the signal would force the transmission of a sign bit, and
values of one or more make the filter unstable, as is known. The value of b calculated using
formula (2) is then modified so as to guarantee the best quality of the encoded signal.
Furthermore, instead of the values of d and b calculated by formulas (1) and (2), a given frame may
use values obtained by linear interpolation between the values calculated for the previous frame
and the values calculated for the current frame.
[0009] Together with d and b, the prediction gain G is calculated. It represents the ratio between
the energies of the input and output signals of the long-term predictor and gives a measure of the
effectiveness of long-term prediction. Gain G is defined by [Equation 5]

$$G = \frac{1}{1 - b\, R[r_f(d)] / E'(r_f)}$$

where [Equation 6]

$$E'(r_f) = \sum_{n} r_f^2(n) \qquad (4)$$

Gain G is used to decide whether the speech segment being encoded is voiced, which is indicated by
values of G and b larger than respective thresholds Gthr and bthr. In the case of voiced sound, LTA
generates the flag V, which is used to decide whether to perform interpolation and to introduce the
time shift. The first modification of delay d consists of searching for a maximum of function (1)
in a predetermined neighbourhood (for example, +/-15%) of the value obtained for the previous
frame. If the maximum found there differs from the main maximum by an amount smaller than a certain
limit, the new value, which is favourable to interpolation and gives a smoother contour, is used.
This secondary search is performed only when the signal in the previous frame is strongly voiced
and undergoes interpolation. Furthermore, this modification, when it is carried out, is made before
b and G are calculated, so that the already modified value of d is used in those calculations.
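
The basic long-term analysis described above (delay search, coefficient b, prediction gain G) can be sketched as follows. It assumes that the correlations are plain sums over one analysis window, that b = R/E is limited to the range 0 to 1, and that G = 1/(1 - b R/E'). The window placement, the function names, and any threshold values passed to `is_voiced` are assumptions; the secondary search around the previous delay is not reproduced.

```python
def long_term_analysis(rf, start, length, d_min=20, d_max=120):
    """Return (d, b, G) for the analysis window rf[start:start + length]."""
    if start < d_max:
        raise ValueError("need at least d_max past samples before the window")
    window = range(start, start + length)

    # Formula (1): keep the delay with the largest autocorrelation.
    best_d, best_r = d_min, float("-inf")
    for d in range(d_min, d_max + 1):
        r = sum(rf[n] * rf[n - d] for n in window)
        if r > best_r:
            best_d, best_r = d, r

    # Formula (3) and b = R / E, limited to the range [0, 1].
    energy_delayed = sum(rf[n - best_d] ** 2 for n in window)
    b = best_r / energy_delayed if energy_delayed > 0.0 else 0.0
    b = min(max(b, 0.0), 1.0)

    # Formula (4) and the prediction gain G = 1 / (1 - b R / E').
    energy_in = sum(rf[n] ** 2 for n in window)
    if energy_in == 0.0:
        return best_d, b, 1.0
    denom = 1.0 - b * best_r / energy_in
    gain = 1.0 / denom if denom > 0.0 else float("inf")
    return best_d, b, gain


def is_voiced(gain, b, g_thr, b_thr):
    """Voicing decision: both G and b must exceed their thresholds."""
    return gain > g_thr and b > b_thr
```

The exhaustive loop over d is the simplest correct form; ties are resolved in favour of the shortest delay because a later candidate must be strictly larger to replace the current best.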

[0010] The second modification is linked to the existence of the time shift mechanism, which
introduces a variable delay [External Character 1] h whose effect is comparable to that of
asynchronous operation of the encoder. To recover synchronism, the value of d calculated by LTA and
modified as described above is changed by adding to it a correction term d', linked to the amount
of the shift itself and given by [Equation 7]

$$d' = \frac{h\, d}{\gamma\, L_f}$$

In this expression, [External Character 2] h is the shift accumulated up to the current frame,
expressed as a number of samples of the residual signal upsampled by the factor gamma, and d and Lf
have the meanings stated above. Upsampling will be discussed in more detail with reference to
circuit TS. The correction is made when the current frame requires interpolation and the speech
segment is not unvoiced. The first condition is necessary because, if there is no interpolation, no
shift is performed. The second reflects the fact that, unless the signal is unvoiced, even a
minimal non-uniformity in the correction of d is usually perceived, so that an exact value is
needed in that situation. Before the correction term is added to d, its absolute value is limited
to a maximum |d'|max equal to 1. Furthermore, the correction does not alter the decision about
interpolation (described later), and it is performed only if it does not take the value of d
outside the range of allowed values. The first modification concerning b consists of clipping b at
a first upper limit b1, because a value of b which is too large would cause an excessive increase
of energy and produce noise. Limit b1 is linked to the ratio between the energies in a pitch period
of the current frame and of the previous frame, and is given by

$$b_1 = \left[ \frac{E''(r_f)_0}{E''(r_f)_{-1}} \right]^{d / 2L_f}$$

In this expression the quantity [Equation 8]

$$E''(r_f) = \sum_{n} r_f^2(n)$$

with the sum taken over one pitch period of d samples, is the energy in the pitch period d, and the
indices 0 and -1 denote the current frame and the previous frame, respectively. The modification is
made when the energy in the previous frame exceeds a certain threshold.
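
The clipping of b at the first limit b1 can be sketched as follows. It assumes that b1 is the ratio of the pitch-period energies of the current and previous frame raised to the power d/(2 Lf), which is how the garbled expression in the text is read here; the function name and the threshold parameter `e_thr` are hypothetical, since the text only says that "a certain threshold" applies.

```python
def clip_gain_b1(b, energy_cur, energy_prev, d, lf=160, e_thr=0.0):
    """Clip b at b1 = (E_cur / E_prev) ** (d / (2 * lf)).

    The clipping is applied only when the previous-frame energy exceeds
    e_thr (threshold value assumed; it is not given in the text).
    """
    if energy_prev <= e_thr:
        return b
    b1 = (energy_cur / energy_prev) ** (d / (2.0 * lf))
    return min(b, b1)
```

The exponent can be read as follows: the square root converts the energy ratio into an amplitude ratio per frame, and the factor d/Lf scales that ratio down to a single pitch period.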

[0011] If the value of G is low (below Gthr), which indicates a speech segment with low
periodicity, and b is comparatively large (larger than a second limit b2), a further limitation of
b to the value b2 is adopted, since adopting the actual value in this case would create artifacts
in the encoded signal. As for interpolation, it is performed when the relative variation of d
between two consecutive frames does not exceed, in absolute value, a predetermined amount (for
example, 15%) and the values of b in both frames are positive. The actual calculation of the values
of d and b used in interpolation is performed in the long-term synthesis filter LTS1, to which LTA
sends flag F when the above conditions are met. The same flag is also supplied to circuit EM, which
determines the optimum time shift and excitation. The information about interpolation is also
needed by the synthesis filter in the decoder, but it does not have to be transmitted, since it can
be derived there directly, exactly as in the encoder, by comparing the values of d and b relevant
to two frames. The values of d and b determined for each frame are the long-term analysis
information to be inserted into the encoded signal and, as before, are converted into respective
indices j(d), j(b), which are transmitted to the decoder through lines 2b and 2c after suitable
coding. Index j(b) is determined through a quantization operation, in which the maximum is limited
to 1 and values of b below one half of the first quantized value are set to 0. As for d, no
quantization is required, since it is already a discrete quantity. It is nevertheless desirable to
transmit d in the form of an index, for uniformity with the other information, and the conversion
of the value of d into an index actually consists of shifting the range of possible values so that
it begins from 1 instead of from the value dmin. In the described example (101 values of d and
j(d)), 7 bits are required to encode index j(d), and these bits also allow values of j(d) outside
the predetermined range to be encoded. One of these additional values (for example, the value 127)
is used to indicate that b is set to 0. Since, if b = 0, the long-term synthesis filter does not
contribute to the reconstructed signal and the delay information is unnecessary, this value is
supplied to the decoder instead of the index j(d) corresponding to the actual value of d. However,
in addition to the information which forces b to 0, the index j(b) corresponding to the minimum
value of b is also transmitted.

[0012] For simplicity, the circuits which generate indices j(b) and j(d) are considered to be
included in block LTA. Note that the modification of d which takes a possible shift into account is
performed after the modification of b, because circuit LTA can decide on the voiced nature of the
sound and on the need to perform interpolation, i.e., a shift, only on the basis of the modified
value of b. The operations performed by LTA are described in detail in the appendix, which includes
a program written in C. Given this, a person skilled in the art has no difficulty in designing
equipment which performs the described functions. The reconstruction circuit LTR1, which consists
essentially of simple read-only memories addressed by the indices, reconverts indices j(d) and j(b)
into the reconstructed (quantized) values of the respective parameters. In this reconstruction,
LTR1 gives the actual values of d and b when j(d) indicates a value considered for the delay (that
is, when j(d) is in the range 1-101). If j(d) indicates any one of the out-of-range values (102 to
127), LTR1 gives the value 0 for b and the value dmin for d. The fact that, when the parameters are
reconstructed, every index j(d) which does not correspond to a value considered for the delay is
interpreted as an indication which forces b to 0, and not only the index actually used for this
purpose, makes it possible to reconstruct the value b = 0 even in the case of errors on the least
significant bits of that index. In any case, should the reconstruction of b = 0 fail, circuit LTR1
has the corresponding index j(b) available, and the minimum value of b is generated. For
simplicity, the reconstructed (that is, quantized) values are denoted below as b and d.
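
The index convention for the delay described above can be sketched as follows, using the example values from the text (dmin = 20, 101 delay values, 7-bit index, value 127 signalling b = 0). The function and constant names are hypothetical, and the handling of index 0, which the text does not mention, is an assumption.

```python
D_MIN, D_MAX = 20, 120          # example delay range from the text
NUM_DELAYS = D_MAX - D_MIN + 1  # 101 valid indices, 1..101
NO_LTP_INDEX = 127              # example value used to signal b = 0


def encode_delay(d, b):
    """Map delay d to its 7-bit index j(d); signal b == 0 with NO_LTP_INDEX."""
    if b == 0.0:
        return NO_LTP_INDEX
    if not D_MIN <= d <= D_MAX:
        raise ValueError("delay outside the allowed range")
    return d - D_MIN + 1


def decode_delay(j_d, b_quantized):
    """Return (d, b) as reconstruction circuit LTR1 is described to do.

    Any index outside 1..101 yields b = 0 and d = D_MIN. Treating index 0
    like the values 102..127 is an assumption.
    """
    if 1 <= j_d <= NUM_DELAYS:
        return j_d + D_MIN - 1, b_quantized
    return D_MIN, 0.0
```

Because every out-of-range index decodes to b = 0, a single bit error that turns 127 into 126 or 125 still produces the intended result, which is the robustness property the paragraph describes.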

[0013] The long-term synthesis filter LTS1 filters the excitation signal s1(n) according to the
usual function 1/P(z) = 1/(1 - b z^-d) and generates the reconstructed short-term residual signal
ss(n). The excitation consists of shape information (innovation), represented by one of the words
s(n) of the innovation codebook IC1, of a positive or zero amplitude parameter g (innovation gain),
chosen from the innovation gain codebook IG1, and of sign information, represented by the parameter
sigma (innovation sign), whose value is +/-1. Signal s1(n) is therefore given by
s1(n) = sigma g s(n) = g1 s(n) and is obtained through a multiplier M1. For simplicity, parameter
sigma is assumed to be read from codebook IG1. To ease understanding, codebooks IC1 and IG1 are
represented as circuit blocks (which suggests memories containing them) although, as stated above,
the specific structure of the innovation codebook makes its storage unnecessary. The structure of
the innovation and gain codebooks is examined later. To obtain the sample of the reconstructed
residual signal ss(n) at instant n, LTS1 must weight by the factor b the sample relevant to instant
n-d. When interpolation is not performed, the operation of LTS1 is entirely conventional. In the
case of interpolation, the values of d and b are calculated for every sample, for n = 0 ... Lf-1,
with Delta d = [d0 - d(-1)]/Lf and Delta b = [b0 - b(-1)]/Lf, according to [Equation 9]

$$d(n) = d(-1) + (n+1)\,\Delta d$$
$$b(n) = b(-1) + (n+1)\,\Delta b \qquad (5)$$

The notations d0 and b0 denote the values relevant to the current frame, and d(-1) and b(-1) denote
the values relevant to the previous frame. Interpolation is therefore linear and extends over the
whole frame. The values of d(n) and b(n) change at every sample. In general d(n) is not an integer,
which means that the value of signal ss at instant n-d(n), in continuous time, does not coincide
with that of any sample actually available and must be estimated. According to this invention, the
estimate is performed through second-order polynomial interpolation centred on the discrete time
instant nearest to n-d(n) (namely, by fitting a parabola), and the estimated value is then
multiplied by the interpolated value b(n). The adopted interpolation procedure is computationally
much simpler than more sophisticated interpolation based on filtering the signal. Its effect is
nevertheless substantially that of a low-pass filter, and it is useful for the good operation of
the encoder because it prevents the reconstructed signal from having excessive periodicity.
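
The operation of LTS1 with interpolation can be sketched as follows. It assumes the per-sample parameters of formula (5) and a parabola fitted through the three samples centred on the integer instant nearest to n-d(n); the function name, the buffer handling, and the rounding rule for exact half-sample positions are assumptions.

```python
import math


def lts_interpolated(excitation, history, d_prev, d_cur, b_prev, b_cur):
    """Long-term synthesis over one frame with linearly interpolated d and b.

    excitation: excitation samples s1(n) for the frame
    history:    past reconstructed samples ss(n), oldest first
    Returns the reconstructed samples ss(n) for the frame.
    """
    lf = len(excitation)
    off = len(history)
    buf = list(history) + [0.0] * lf
    delta_d = (d_cur - d_prev) / lf
    delta_b = (b_cur - b_prev) / lf
    for n in range(lf):
        d_n = d_prev + (n + 1) * delta_d      # formula (5)
        b_n = b_prev + (n + 1) * delta_b
        t = off + n - d_n                      # fractional read position
        k = math.floor(t + 0.5)                # nearest integer instant
        if k < 1 or k + 1 >= off + n:
            raise ValueError("delay needs samples that are not available")
        f = t - k                              # offset in [-0.5, 0.5)
        y_m, y_0, y_p = buf[k - 1], buf[k], buf[k + 1]
        # Parabola through (-1, y_m), (0, y_0), (1, y_p), evaluated at f.
        estimate = (y_0 + 0.5 * f * (y_p - y_m)
                    + 0.5 * f * f * (y_p - 2.0 * y_0 + y_m))
        buf[off + n] = excitation[n] + b_n * estimate
    return buf[off:]
```

When the delay is an integer and constant, f is zero and the routine reduces to the conventional recursion ss(n) = s1(n) + b ss(n-d), which matches the statement that LTS1 is entirely conventional without interpolation.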

[0014] The reconstructed short-term residual signal ss(n) is supplied to the short-term synthesis
filter STS1, whose transfer function is 1/[1-A(z)]. This filter generates the reconstructed speech
signal y(n), which is supplied to the spectral weighting filter SW, whose transfer function is, as
usual, [1-A(z)]/[1-Aw(z)]. In this expression, Aw(z) is the function [Equation 10]

$$A_w(z) = \sum_{i=1}^{P} a_{wi}\, z^{-i}$$

where a_wi = ai gamma^i and gamma is an experimentally determined correction factor which sets the
bandwidth expansion around the formants. The weighted reconstructed signal yw(n) is subtracted, in
adder SM, from the weighted modified signal xw(n), which is obtained by filtering the output signal
of TS in a cascade of two filters STS', SW' identical to STS1 and SW, respectively. The output of
SM is the weighted error signal e(n), which is supplied to the error energy minimization circuit
EM. This circuit performs all the operations required to determine the optimum shift and
excitation.

[0015] The purpose of circuit TS is to align in time the signal to be encoded with the replica
produced by the long-term synthesis filter, and in particular to avoid offsets between the pitch
peaks in the signal predicted by LTS1 and those in the original signal. To this end, in each
subframe TS shifts by a certain amount Δh the time window of Ls samples that delimits the subframe
itself. The shift to be applied is determined by unit EM through a fast search procedure within a
range of values bounded by the maximum admissible shift. The shift is applied to the residual
signal and not to the original signal, since the resulting distortion is smoothed by the subsequent
filtering in STS' and SW' and thereby becomes substantially imperceptible. The shift applied in a
subframe is added algebraically to the shift accumulated up to that time, giving the overall shift,
and its variation is limited in order to avoid too rapid fluctuations. The overall shift cannot
exceed a certain maximum (H samples of the signal at the original rate). This is why H samples of
the following frame are loaded into MT. The purpose of limiting the shift variation is to avoid
excessive distortion, whereas the limit on the overall shift is set by the availability of future
samples, and hence by the delay that can be tolerated in the coding procedure. The time shift has a
resolution finer than one sampling period of the original signal, so the residual signal must be
upsampled. Accordingly, circuit TS comprises an upsampling circuit US (in practice an interpolation
filter), which supplies at its output the upsampled residual signal
[External Character 3]
rs(n)

and a shifting element SH, which receives from EM the information on the amount of shift
[External Character 4]

and generates the modified upsampled residual signal
[External Character 5]
rm(n)

As an example, the upsampling factor Γ is 8, so that the upsampled signal has a sampling frequency
of 64 kHz; this upsampling ratio gives a resolution adequate for all the purposes required.
Furthermore, for correct operation of the interpolation filter, a predetermined number of samples
following the sample of interest must always be available, and this is the reason why K further
samples of the following frame are also loaded into MT.

[0016] The downsampling that yields the modified residual signal at the 8 kHz sampling frequency
need not actually be carried out, because it is performed implicitly by reading, with the
appropriate phase as required, only one sample out of every Γ of
[External Character 6]
rm(n)

In practice, element SH is a memory which, for each subframe, is loaded with the Γ·Ls samples of
the upsampled residual signal plus a certain fixed number of preceding and following samples,
determined by the maximum shift permitted in a frame (in fact, as explained in the description of
the optimum shift search, a number of samples equal to twice the maximum shift). Element SH is
addressed for reading by the error energy minimization unit EM so that the Ls appropriately shifted
samples for the current subframe are supplied to the downstream circuits.

For each subframe an innovation codebook is examined. The codebook comprises a predetermined number
of words, each of Ls samples, only a few of which differ from 0. This choice stems from the fact
that, since the codebook is constrained, it is illusory to expect to find a word with many pulses
(that is, non-zero samples) in which all the pulses are actually appropriate; it also makes it
possible to reduce the amount of computation required to find the optimum excitation. In the
preferred embodiment of this invention, the codebook consists of two sections. The first section
includes Ls words, each having a single non-zero sample of amplitude 1 and positive sign, and Ls-1
zero samples. The non-zero sample occupies a different position in each word, every word being
obtained from the previous one by shifting the non-zero sample by one position. For this first
section of the codebook, signal s(n) is expressed as [Equation 11]

$$s(n) = \delta(n - n_1) \qquad (5)$$

where δ is the well-known unit impulse function, and n and n1 can take values between 0 and Ls-1.

[0017] The second section includes words having two non-zero samples of amplitude 1 and Ls-2 zero
samples. Such words are generated starting from a limited number of keywords, by the method
disclosed in European patent application EP-A-0396121 in the name of CSELT. In the example
considered there are three keywords, all having the first pulse in position 0 and the second pulse
in a respective key position n2(1), n2(2), n2(3); the other words are obtained by shifting the
pulse pair towards the end of the word until the second pulse reaches the end of the word or the
first pulse reaches the respective key position. The key positions are chosen so as to give rise to
Ni2 (in particular 21) possible positions of the pulse pair. Since, as described in the
above-mentioned European application, each such position yields two different words, which differ
only in the sign of the second pulse, the total number of words in the innovation codebook is
Ls+2·Ni2 (62 in this example). For this second section of the codebook, with n = 0 ... Ls-1,
n1 = 0 ... Ls-1-n2(p), n2 = n2(p) ... Ls-1 and p = 1 ... Nip, where n2(p) denotes a generic key
position and Nip is the number of key positions used (3 in this example), an innovation word is
expressed as [Equation 12]

$$s(n) = \delta(n - n_1) \pm \delta(n - n_2) \qquad (6)$$

An innovation codebook whose words have few non-zero samples and are obtained from a limited number
of keywords by shifts of one position at a time has a simple deterministic structure. This
structure allows a fast search for the optimum excitation that requires neither storage of the
codebook nor actual filtering of the candidate excitation signals.
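
The structure of the two-section codebook can be illustrated with a short sketch. Everything specific here is an assumption: the description does not list the key positions, so the values used in the usage note are hypothetical, and the subframe length Ls = 20 is only inferred from the stated totals (62 = Ls + 2·21). The range n1 = 0 ... Ls-1-n2(p) is taken from the text.

```c
/* Number of pulse-pair positions generated by the key positions key[0..nkeys-1]
   in words of ls samples: for each key the pair (n1, n1 + key) slides one
   position at a time while the second pulse stays inside the word,
   i.e. n1 = 0 .. ls-1-key. */
int pair_positions(int ls, const int *key, int nkeys)
{
    int total = 0;
    for (int p = 0; p < nkeys; p++)
        total += ls - key[p];
    return total;
}

/* Total codebook size: ls single-pulse words plus two signs per pair position. */
int codebook_size(int ls, const int *key, int nkeys)
{
    return ls + 2 * pair_positions(ls, key, nkeys);
}

/* Build the two-pulse word s(n) = delta(n - n1) + sign * delta(n - n1 - key),
   with sign = +1 or -1. The word is generated on demand, not stored. */
void two_pulse_word(double *s, int ls, int n1, int key, int sign)
{
    for (int n = 0; n < ls; n++)
        s[n] = 0.0;
    s[n1] = 1.0;
    s[n1 + key] = (double)sign;
}
```

With the hypothetical key positions 11, 13 and 15 and ls = 20, `pair_positions` gives 21 and `codebook_size` gives 62, matching the totals quoted in the description.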

[0018] During the search for the optimum innovation, trials with the words of the first section of
the codebook are performed only when long-term analysis indicates a voiced sound or when, on the
contrary, a strong energy concentration is found in a short signal section. Such a strong
concentration can signal the beginning, i.e. the onset, of a voiced section which is nevertheless
classified otherwise, because the classification is based on long-term analysis and the preceding
signal section in fact shows no feature that reveals the onset. Under such conditions filter LTS1
cannot supply a correct prediction signal, which is absolutely necessary for the pitch pulses to be
reproduced correctly. Use of a single-pulse word is therefore in itself beneficial to the quality
of the encoded signal, since it compensates for unsuitable operation (in voiced sections) or for
the impossibility of correct operation (at onsets) of the long-term synthesis filter. Single-pulse
words are instead not used to reproduce unvoiced sounds that are not onsets: even when one of them
actually gives the minimum error signal energy, its subjective effect is poor, so that use of a
single-pulse word is generally counterproductive. The way in which a strong energy concentration in
a short time is detected will be described later. The words in the codebook are identified by
respective indices j(s), and the index of the optimum word, suitably encoded, is transmitted to the
decoder through line 2d. In the example given here the codebook includes 62 words, so that, without
changing the number of bits encoding j(s), two further values of j(s) are available which
correspond to no word in the codebook. These values are used to represent zero innovation gain, in
the same way as is done for the long-term prediction delay and coefficient, as will also be stated
later. When such an index is generated, only one of the two values of j(s) that correspond to no
innovation word is used to indicate g=0; when decoding, g is set to 0 for both values of j(s).
[0019] Gain g is quantized using a codebook constructed so as to save coding bits with respect to
those actually required to represent all the possible values contained in the codebook. The gain
information for each subframe is expressed in the form of two indices j(gmax) and j(gnor), the
first of which is associated with the maximum of g in the frame and the second with the difference
between this maximum and the actual value, together with sign σ. This information is transmitted to
the decoder through line 2e. The codebook contains Nig possible absolute values of g, where
Nig = Nim+Nin-1 and Nim and Nin are two different powers of 2. For example, Nim = 2^4 and
Nin = 2^2, or Nim = 2^4 and Nin = 2^3, can be used. In each subframe, the optimum value of g
determined by the error minimization procedure described later is quantized, generating the
respective index j(g), which is not transmitted but is reconstructed at the decoder. At the end of
a frame, the value j(gmax) associated with the greatest gain in the frame is identified, and it is
transmitted as it is when it is not below Nin. Otherwise, index j(gmax) is set to the value Nin.
Thus j(gmax) can take Nim values, and the number of coding bits is limited accordingly. Once
j(gmax) has been identified, index j(gnor) is calculated for each subframe by the formula
j(gnor) = j(gmax)-j(g), and j(gnor) can in principle take a value in the range between 0 and
Nim+Nin-2. The actual value of index j(gnor) is transmitted only when it is not larger than Nin-1.
Otherwise the gain is set to 0 (that is, the innovation is silenced for subframes whose gain is
very small relative to the maximum), and the innovation word index j(s) is set to one of the values
which correspond to no codeword and which indicate transmission of a word with zero gain. In this
way reduced differential dynamics are used, and bits are saved with respect to those needed to
represent the gain over its whole dynamics, at the cost of a slight performance loss due to the
possible silencing of the innovation. In order to minimize the effect of channel errors on
innovation index j(s), in case of silencing the value Nin-1 is in any case transmitted for index
j(gnor).
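
The two-index gain coding can be sketched as below. This is a minimal illustration, not the patent's listing: the function name and array layout are hypothetical, and it assumes that j(gmax) is raised to Nin whenever it falls below Nin, which is what restricts j(gmax) to Nim values and keeps the decoded index j(gmax)-j(gnor) at 1 or above.

```c
/* Encode the per-subframe gain indices j_g[0..nsub-1] (each in 1..nim+nin-1)
   of one frame. Returns j(gmax); writes j(gnor) and a silencing flag per subframe.
   Assumption: j(gmax) is clamped from below to nin. */
int encode_gain_indices(const int *j_g, int nsub, int nin,
                        int *j_gnor, int *silent)
{
    int j_gmax = j_g[0];
    for (int k = 1; k < nsub; k++)
        if (j_g[k] > j_gmax)
            j_gmax = j_g[k];
    if (j_gmax < nin)
        j_gmax = nin;

    for (int k = 0; k < nsub; k++) {
        int diff = j_gmax - j_g[k];
        silent[k] = diff > nin - 1;            /* gain too far below the maximum */
        /* On silencing, nin-1 is sent anyway; the decoder learns g = 0 from j(s). */
        j_gnor[k] = silent[k] ? nin - 1 : diff;
    }
    return j_gmax;
}
```

At the decoder, a non-silenced subframe recovers its gain index as j(g) = j(gmax) - j(gnor).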

[0020] The gain codebook is a logarithmic codebook, so that the ratio between two consecutive
values is constant. This ratio is chosen taking into account the following requirements:

- The values, in dB, must be as close together as possible, in order to make quantization as
accurate as possible.

- The overall dynamics between minimum gain g(1) and maximum gain g(Nim+Nin-1) must be sufficiently
extended to cover the different kinds of sound and a suitable set of different speech levels.

- The differential dynamics between g(x-Nim+1) and g(x) must be sufficiently extended to make the
probability of silencing suitably low.

For example, for the values of Nim and Nin mentioned above, the ratio between two consecutive gain
levels ranges from 3 dB to 6 dB.

The fast search procedure for the optimum shift and excitation will now be described with reference
to the operating diagram of Fig. 2, which corresponds to the set of blocks M1, LTS1, STS1, STS',
SM, SW and SW' of Fig. 1. In Fig. 2, the filters resulting from the cascade of STS1 and SW and from
the cascade of STS' and SW' are represented by blocks STW1 and STW2, which are filters with
transfer function 1/[1-Aw(z)]; apart from these blocks, the same references as in Fig. 1 are used.
In this drawing each filter is split into a component with zero input (LTSa, STW1a, STW2a), which
gives the contribution of the initial conditions (the memory of the filtering of the preceding
subframes), and a component (STW1b, STW2b) which is reset at each subframe (filtering with zero
initial conditions), as indicated by signal R supplied by a time base that is not shown. Since
delay d is assumed to be not smaller than a subframe, filtering of the excitation with zero initial
conditions is short-term filtering only.

[0021] The determination of the optimum shift comprises the following three steps:

- evaluation of the need to perform a shift;

- determination of a suitable range of shift values;

- search for the optimum shift within that range.

In the first step it is checked whether three conditions are met, namely:

- the subframe is not silent, as indicated by the fact that the energy of rs(n) is larger than a
predetermined threshold;

- the signal is voiced or interpolated, as indicated by flags V and F from LTA;

- a peak of rs(n) actually occurs in the subframe, as indicated by the fact that the mean power of
rs(n) in the subframe (energy divided by the number of samples Ls) is larger than or equal to that
in the period of length d ending at the last sample of the subframe itself.

The reason for the first condition is clear. As to the second and third conditions, a shift must be
performed only when a pitch peak lies in the subframe. This occurs first of all in voiced sections,
and also when interpolation has taken place: the fact that the values of the parameters obtained in
two consecutive frames are very close suggests a definite periodicity in the signal segment to be
encoded. In this case it is useful to enable the shift, so that the risk of misalignment between
the reconstructed signal and the original signal is further reduced. The computation of energy and
power can be performed on either the upsampled signal or the original signal. During this
computation the maximum absolute value of [External Character 7]
rs

in the current subframe and its position are also obtained, for use in determining the shift. In
order to locate the maximum with the greatest resolution, it is essential to operate on the
upsampled signal.
[0022] 

[External Character 8]

The optimum shift value within the trial range is the one that minimizes the energy of an error
signal e1(n), given by the difference between the modified weighted signal xw(n) (Fig. 1) and the
contribution yw1(n) of the memory of the excitation filters. It is obtained by a fast search
procedure that reduces the computational complexity required. For this fast search it must be taken
into account, on the one hand, that the output signal xw(n) of STW2 is expressed as [Equation 13]

$$x_w(n) = r_m(n) + \sum_{i=1}^{p} a_{wi}\, x_w(n-i) \qquad (7)$$

(with n from 0 to Ls-1)

and, on the other hand, that the same signal is the sum of the output xw1 of STW2a and the output
xw2 of STW2b. The sum in formula (7) represents signal xw1 which, like the corresponding
contribution yw1 of the chain LTSa and STW1a, is calculated only once; hence the error defined as
e0 = xw1-yw1 is also calculated only once, and the result of this calculation is available at the
output of adder SMa.
[External Character 9]

[0023] The procedure adopted according to this invention for determining xw2 takes into account
that, for a given shift value h, signal xw2 is given by [Equation 14]

$$x_{w2}(n) = r_s(\Gamma n + h) + \sum_{i=1}^{\min(n,p)} a_{wi}\, x_{w2}(n-i) \qquad (8)$$

Since the samples with n-i<0, i.e. the samples of the preceding subframe, must not be taken into
account when filtering with zero initial conditions, the upper limit of the sum is the minimum
between n and p.
[External Character 10]

Instead of being calculated by formula (8), xw2 is calculated according to [Equation 15]

$$x_{w2}(n) = x_{w2}(n-1) + Q(n)\, r_s \quad (n = L_s-1, \ldots, 1), \qquad x_{w2}(0) = r_s \qquad (9)$$

In formula (9), Q(n) is the impulse response of filter STW, with Q(0) = 1 (it is calculated only
for Ls values of n). Considering that Q is determined only once, apart from certain values, it can
be seen that formula (9) requires far fewer computations than formula (8).
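
The saving obtained with a recursion of the kind of formula (9) can be illustrated with a sketch. It rests on an interpretation that is not spelled out in the text as it survives: the zero-state response for a window moved one sample earlier is obtained from the response for the previous window, using the impulse response Q with Q(0) = 1. For simplicity the sketch works at a single sampling rate (no upsampling factor), and all function names are hypothetical.

```c
/* Zero-state filtering as in formula (8): x[n] = in[n] + sum_i aw_i * x[n-i],
   with the sum limited to min(n, p) terms. aw[0] holds a_w1. */
void zero_state_filter(const double *in, const double *aw, int p,
                       double *x, int ls)
{
    for (int n = 0; n < ls; n++) {
        double acc = in[n];
        int top = n < p ? n : p;
        for (int i = 1; i <= top; i++)
            acc += aw[i - 1] * x[n - i];
        x[n] = acc;
    }
}

/* Update x in place when the analysis window starts one sample earlier and
   r_new is the sample that enters at its beginning:
   x_new[n] = x_old[n-1] + q[n] * r_new for n = ls-1 .. 1, x_new[0] = r_new.
   q is the impulse response of the same filter, with q[0] = 1. */
void zero_state_shift(double *x, const double *q, double r_new, int ls)
{
    for (int n = ls - 1; n >= 1; n--)
        x[n] = x[n - 1] + q[n] * r_new;
    x[0] = r_new;
}
```

Each shift then costs about Ls multiply-adds instead of roughly p·Ls for a full filtering pass, which is the kind of reduction the description attributes to formula (9).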

[0024] In practice, xw2 must be calculated according to formulas (8) and (9) Γ times, once for each
of the Γ samples of the upsampled signal that correspond to one 8 kHz sampling period. Once the
optimum shift that minimizes the energy of e1(n) has been found, minimization of the energy of e(n)
can begin in order to find the optimum excitation. Unit EM directly computes the expression of the
energy to be minimized as a function of the positions of the pulses in the innovation word; for
this purpose it uses the impulse response Q, which has been calculated during the search for the
optimum shift. Use of the impulse response for carrying out the filtering is made convenient by the
fact that each word contains at most two non-zero samples. Furthermore, considering the more
general case of a word with two pulses, the response to such a word is simply the sum of two
impulse responses spaced by a distance equal to the key, and the responses to all the other words
associated with that key are obtained by shifting this overall response, changing only one sample
at a time. For simplicity, the range of the summation index, which extends over all the samples in
a subframe, is not indicated in the following mathematical expressions. For a generic excitation
word, with u(n) denoting the corresponding output signal of STW1b, error e(n) is given by
e(n) = e1(n)-yw2(n) = e1(n)-g·u(n). The energy of e(n) is given by [Equation 16]

$$E(e) = \sum e^2(n) = \sum \left[ e_1(n) - g\, u(n) \right]^2 \qquad (10)$$

which can be written as $E(e) = \sum e_1^2 - 2g \sum e_1 u + g^2 \sum u^2$. Considering that the
first and the last sums express the energies of signals e1 and u, and using the notation R(e1 u)
for the second sum, which is the cross-correlation R(e1 u)(K) between them evaluated at K=0, one
obtains [Equation 17]

$$E(e) = E(e_1) - 2g\, R(e_1 u) + g^2 E(u) \qquad (11)$$
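
As a check on the expansion leading to formula (11), the following sketch evaluates the error energy both directly from formula (10) and from the three terms E(e1), R(e1 u) and E(u). The function names are hypothetical and the code is only meant to make the identity concrete.

```c
/* Error energy computed sample by sample, as in formula (10). */
double error_energy_direct(const double *e1, const double *u, double g, int ls)
{
    double acc = 0.0;
    for (int n = 0; n < ls; n++) {
        double e = e1[n] - g * u[n];
        acc += e * e;
    }
    return acc;
}

/* Same energy from the expanded form (11): E(e1) - 2 g R(e1 u) + g^2 E(u). */
double error_energy_expanded(const double *e1, const double *u, double g, int ls)
{
    double ee1 = 0.0, r = 0.0, eu = 0.0;
    for (int n = 0; n < ls; n++) {
        ee1 += e1[n] * e1[n];   /* energy of e1                     */
        r   += e1[n] * u[n];    /* cross-correlation at zero lag    */
        eu  += u[n] * u[n];     /* energy of the filtered word      */
    }
    return ee1 - 2.0 * g * r + g * g * eu;
}
```

The point of the expanded form is that E(e1) does not depend on the candidate word, so only R(e1 u) and E(u) have to be evaluated per word.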

[0025] Minimizing E(e) is the same as maximizing the energy difference [Equation 18]

$$\Delta E = E(e_1) - E(e) = 2g\, R(e_1 u) - g^2 E(u) \qquad (12)$$

For each word of the codebook examined, the maximum of formula (12) with respect to g is obtained
immediately by calculating the derivative with respect to g and setting it to 0, which gives the
value g0 = R(e1 u)/E(u), to which corresponds [Equation 19]

$$\Delta E_0 = R(e_1 u)^2 / E(u) = g_0\, R(e_1 u) \qquad (13)$$

The specific structure of the innovation codebook makes it possible to obtain E(u) and R(e1 u)
directly as a function of the position of the pulse or pulses in the word, by using the previously
determined impulse response of filter STW2, which is equal to that of filter STW1. In fact, E(u) is
[Equation 20]

$$E(u) = \sum \left[ Q(n-n_1) \pm Q(n-n_2) \right]^2 = \sum Q^2(n-n_1) + \sum Q^2(n-n_2) \pm 2 \sum Q(n-n_1)\, Q(n-n_2)$$

which, simplified, becomes [Equation 21]

$$E(u) = E_q(n_1) + E_q(n_2) \pm \rho(n_1, n_2) \qquad (14)$$

where Eq is the energy of the suitably truncated signal Q (that is, calculated over a number of
samples determined by the positions n1 and n2). Furthermore, R(e1 u) is written as [Equation 22]

$$R(e_1 u) = R[e_1 q(n_1)] \pm R[e_1 q(n_2)] \qquad (15)$$

where [Equation 23]

$$R[e_1 q(K)] = \sum_{n=0}^{L_s-1-K} e_1(n+K)\, Q(n)$$

As is clear from the above, for a single-pulse word formulas (14) and (15) take the form
E(u) = Eq(n1) and R(e1 u) = R[e1 q(n1)]. In order to determine the optimum excitation, the
operations performed by EM in each subframe are divided into three steps.
[0026] a) Before the effect of each innovation word is examined, as soon as values ai are
available, EM calculates and stores the possible values of the three addends in formula (14).
Since, as stated above, filter coefficients ai do not change in the subsequent subframes, this
computation is performed only for the first four subframes. Terms Eq are calculated by the simple
recursive procedure [Equation 24]

$$E_q(L_s-1-n) = E_q(L_s-n) + Q^2(n) \qquad (16)$$

(with n = 1 ... Ls-1, Eq(Ls-1) = 1)

Furthermore, since the codebook contains Ni2 possible pairs of values n1, n2, the computation of ρ
is performed only for such pairs, according to [Equation 25]

$$\rho_k = 2\, Q[n_2(p)]$$

$$\rho_k = \rho_{k+1} + 2\, Q[n + n_2(p)] \cdot Q(n)$$

where n2(p) has the meaning already stated, n = 1 ... Ls-1-n2(p), and k = Ni2 ... 1 identifies a
generic pair of values n1 and n2.

b) As soon as the optimum e1 is available, and before the search procedure, EM calculates and
stores all the values R(e1 q).

c) After these operations, EM calculates the values of E(u) and R(e1 u) one at a time, determines
value g0 and the related value ΔE, and stores the word index and the related value of g that
minimize the energy.
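
Steps a) to c) can be sketched for the two-pulse words of one key as follows. This is a hedged illustration, not the appendix listing: function names and array layouts are hypothetical, tables are indexed by the position n1 of the first pulse, and the recursions follow formula (16) and the update for ρ given above.

```c
/* Step a): eq[n1] = sum_{m=0}^{ls-1-n1} q[m]^2, built with the recursion
   Eq(Ls-1-n) = Eq(Ls-n) + Q(n)^2. */
void pulse_energies(const double *q, double *eq, int ls)
{
    eq[ls - 1] = q[0] * q[0];
    for (int n = 1; n <= ls - 1; n++)
        eq[ls - 1 - n] = eq[ls - n] + q[n] * q[n];
}

/* Step a): rho[n1] = 2 * sum_{m=0}^{ls-1-n1-key} q[m] * q[m + key],
   filled from the last pair position (n1 = ls-1-key) back to n1 = 0. */
void pair_cross_terms(const double *q, double *rho, int ls, int key)
{
    rho[ls - 1 - key] = 2.0 * q[key] * q[0];
    for (int n = 1; n <= ls - 1 - key; n++)
        rho[ls - 1 - key - n] = rho[ls - key - n] + 2.0 * q[n + key] * q[n];
}

/* Step b): rq[k] = sum_{n=0}^{ls-1-k} e1[n + k] * q[n]. */
void target_correlations(const double *e1, const double *q, double *rq, int ls)
{
    for (int k = 0; k < ls; k++) {
        double acc = 0.0;
        for (int n = 0; n + k < ls; n++)
            acc += e1[n + k] * q[n];
        rq[k] = acc;
    }
}

/* Step c) for one word with pulses at n1 and n1 + key, second pulse of sign
   +1 or -1: returns the energy reduction R^2/E and stores g0 = R/E. */
double pair_gain(const double *eq, const double *rho, const double *rq,
                 int n1, int key, int sign, double *g0)
{
    double eu = eq[n1] + eq[n1 + key] + sign * rho[n1];   /* formula (14) */
    double ru = rq[n1] + sign * rq[n1 + key];             /* formula (15) */
    *g0 = ru / eu;
    return ru * ru / eu;
}
```

A full search would call `pair_gain` for every pair position and both signs, keeping the word that gives the largest energy reduction; no candidate word is ever filtered explicitly.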

[0027] As stated above, when the sound is unvoiced, trials with the words of the first section of
the codebook are performed only when a strong energy concentration in a short time, indicating the
beginning (the onset) of a voiced signal section, is detected. For this purpose, the energy of a
group of samples (for example, five samples) of the modified residual signal is calculated within
the subframe, shifting the window that selects the group by one sample at a time from the start of
the subframe until the whole subframe has been scanned, and the group that shows the greatest
energy is stored. Furthermore, the mean power (that is, the energy divided by the number of
samples) in the window that gave the maximum and the mean power in the subframe are also
calculated. Trials with single-pulse words are enabled when the energy of the subframe, as well as
the ratio between the mean power in the window and the mean power in the subframe, are larger than
suitable thresholds. Furthermore, if the optimum innovation consists of a single-pulse word, the
absolute value of gain g is limited to a maximum |g|max = x·|rs|max, where x is a parameter equal
to about 1 and |rs|max is the residual maximum [External Character 11]

which is calculated in the course of operation. The purpose of this limit is to prevent the
introduction into the signal of a pulse with too high an energy relative to the greatest residual
amplitude in the same subframe.

The initial conditions of filters LTSa, STW1a and STW2a are updated at the end of each subframe. To
update LTSa, i.e. ss(n), one pulse or one pair of pulses (corresponding to the optimum innovation
word) must be added to ss1(n). To update yw(n), one or two impulse responses (corresponding to
signal u(n)), multiplied by gain g and appropriately shifted so as to give the value of yw2
corresponding to the optimum excitation, must be added to yw1(n). The impulse response is also used
to update STW2a. Furthermore, since filter STW has order P, only the last P samples of this
response (from Ls-P to Ls) are significant. The operations of EM are also included in the appendix.

The structure of the decoder will now be described with reference to Fig. 3, in which the blocks
corresponding to those already described with reference to Fig. 1 are indicated by the same
references with the digit 2 added. The various reconstructed signals are also indicated by the same
references used for the original signals in the encoder.

[0028] The decoder receives from the encoder, through lines 2a-2e, indices j(φ), j(d), j(b), j(s),
j(gmax) and j(gnor), and the sign σ of the innovation gain. In each subframe, index j(s) selects an
innovation word s(n) in the innovation codebook IC2, or indicates a subframe that gives no
innovation contribution (g=0). If a word is selected, the absolute value of the gain is selected in
the gain codebook IG2 by index j(g) = j(gmax)-j(gnor), and in M2 the word is multiplied by gain g
with sign σ, which gives the reconstructed excitation signal (that is, the fixed codebook
contribution) si(n). This signal is filtered in the long-term synthesis filter LTS2 to give the
reconstructed short-term residual signal ss(n). In order to operate exactly like its replica LTS1
in the encoder, filter LTS2 must receive from the reconstruction circuit LTR2, besides parameters d
and b, the flag F that indicates the need to interpolate d and b. To this end LTR2 includes, in
addition to the read-only memories holding the two tables addressed by indices j(d) and j(b) as in
LTR1 (Fig. 1), means that store the values of d and b relating to two consecutive frames and a
circuit suitable to perform the comparison described for the encoder, so as to decide whether d and
b need to be interpolated. Signal ss(n) leaving LTS2 is filtered in the short-term synthesis filter
STS2, using the coefficients ai generated from index j(φ) in the coefficient reconstruction circuit
STR2. In STS2 too, interpolated coefficients are used for the first subframe of each frame. The
reconstructed speech signal y(n) is further filtered in the adaptive filter PF, which uses
coefficients obtained from linear prediction coefficients ai and introduces into the reconstructed
speech signal a distortion that improves the perceptual effect. The filtered reconstructed signal
yp(n) is taken from the output of filter PF. No further explanation is necessary, since the
adoption of a filter such as PF in speech coding is common knowledge to those skilled in the art.
It should be noted that the decoder does not take into account the shift performed in the encoder.
The purpose of the shift is to make the synthesized signal as close a replica of the original
signal as possible; the decoder therefore actually needs only the information relating to the
excitation and the filters.

[0029] The above description is given by way of non-limiting example only, and it is clear that
many modifications and variations are possible without departing from the scope of this invention.
For example, although samples of amplitude 1 have been described for the innovation, it is also
possible to use samples whose amplitude is chosen from a finite set of values (for example, ±1,
±√2, ±1/√2); in this case the encoded signal includes the information about the relative amplitude
of the innovation samples. Formulas (14) and (15) are easily generalized to the case of pulses
whose amplitude is not 1. Since the relative amplitude of the samples is in any case quantized,
choosing the sample amplitudes from a limited set of values is not restrictive. Although the timing
signals for the various blocks are not shown for simplicity of the drawings, the timing sequence of
the operations is clear from the description.

[0030] Appendix. The listing is written in the C language, and because of the formal requirements
of that language the notation of some quantities differs slightly from that used in the
description above. Differences concerning the presence of subscripts, parentheses and the like give
rise to no risk of ambiguity and are not discussed in detail. Symbols n_, h_, rs_, Eh, h_, Relh,
Nik, n2key, id, ib and is in the listing correspond to n, h, rs, Eq, Q, Relq, Nip, n2(p), j(d),
j(b) and j(s) in the description. The letters "thr" appended to certain symbols (Ers, Erf) indicate
the thresholds for the respective quantities discussed in the description. EPSILON and RO are the
factors discussed in the description, and DELTA indicates the increment in the value of a quantity
(DELTAd corresponds to d* in the description).

1) Long-term analysis

/* Search for the long-term predictor delay: */
Rrfdmax = -DBL_MAX;
for (d_ = Ls; d_ <= D; d_++) {
    Rrf[d_] = 0.;
    for (n = K; n <= Lf+H+K-1-d_; n++) Rrf[d_] += rf[n+d_] * rf[n];
    if (Rrf[d_] > Rrfdmax) {
        d[0] = d_;
        Rrfdmax = Rrf[d_];
    }
}

[0031] 

/* Secondary search for the long-term predictor delay around the previous value: */
dmin = sround((1. - EPSILONdthr) * d[-1]);
dmax = sround((1. + EPSILONdthr) * d[-1]);
if (dmin < Ls) dmin = Ls;
if (dmax > D) dmax = D;
if (utterance[-1] && interpolation[-1] && (d[0] < dmin || d[0] > dmax)) {
    Rrfdmax_ = -DBL_MAX;
    for (d_ = dmin; d_ <= dmax; d_++)
        if (Rrf[d_] > Rrfdmax_) {
            d__ = d_;
            Rrfdmax_ = Rrf[d_];
        }
    if (Rrfdmax_ / Rrfdmax >= RORrfthr) d[0] = d__;
}

/* Computation of the long-term predictor coefficient and gain: */
Erf = Erf_ = Erf_[0] = 0.;
if (K-1+d[0] <= Lf+H+K-1-d[0]) {
    for (n = K; n <= K-1+d[0]; n++) Erf += rf[n] * rf[n];
    for (; n < Lf+H+K-1-d[0]; n++) Erf_ += rf[n] * rf[n];
    for (; n <= Lf+H+K-1; n++) Erf_[0] += rf[n] * rf[n];
    Erf += Erf_;
    Erf_ += Erf_[0];
}

[0032] 

else {
    for (n = K; n <= Lf+H+K-1-d[0]; n++) Erf += rf[n] * rf[n];
    for (; n <= K-1+d[0]; n++) Erf_[0] += rf[n] * rf[n];
    for (; n <= Lf+H+K-1; n++) Erf_ += rf[n] * rf[n];
    Erf_[0] += Erf_;
}
Erf_[0] += rf[Lf+H+K-1-d[0]] * rf[Lf+H+K-1-d[0]];
b[0] = (Erf >= Erfthr) ? Rrf[d[0]] / Erf : 0.;
G = (Erf >= Erfthr && Erf_ >= Erfthr) ? 1. / (1. - b[0] * Rrf[d[0]] / Erf_) : 1.;

/* Correction of the long-term predictor coefficient: */
bmax1 = (Erf_[-1] >= Erfthr) ? pow(Erf_[0]/Erf_[-1], (double)d[0]/(2*Lf)) : DBL_MAX;
if (b[0] > bmax1) b[0] = bmax1;
if (b[0] > bmax2 && G < Gthr) b[0] = bmax2;

/* Zero clipping of the long-term predictor coefficient and quantization preserving the zero value: */
if (b[0] >= bq[1]/2.) {

[0033]

    ib = sq(Nlc, bd, b[0]);
    b[0] = bq[ib];
}
else {
    b[0] = 0.;
    ib = 1;
}
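The zero clipping and quantization step can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `quantize_coeff`, the nearest-level rule, and the convention that index 0 encodes the zero value are all hypothetical and are not taken from the listing, whose quantizer `sq` and index convention are not fully recoverable from the text.

```c
#include <assert.h>

/* Hypothetical nearest-level quantizer (assumption of this sketch):
 * levels[0] is 0 and levels[1..n-1] are positive and ascending.
 * Returns the chosen index and stores the quantized value in *bq_out. */
int quantize_coeff(const double *levels, int n, double b, double *bq_out)
{
    assert(n >= 2);
    int idx = 0;                          /* index 0 keeps the value at zero */
    if (b >= levels[1] / 2.0) {           /* at or above half the first level */
        double d = b - levels[1];
        double best = d < 0.0 ? -d : d;   /* distance to the first level */
        idx = 1;
        for (int i = 2; i < n; i++) {
            double e = b - levels[i];
            if (e < 0.0) e = -e;
            if (e < best) {
                best = e;
                idx = i;
            }
        }
    }
    *bq_out = levels[idx];
    return idx;
}
```

With levels {0, 0.4, 0.8, 1.2}, a coefficient of 0.15 falls below half the first non-zero level and stays at zero, whereas 0.7 is mapped to 0.8.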

/* Interpolation and voicing (utterance) decision: */
if ((double)abs(d[0]-d[-1])/d[-1] <= EPSILONdthr && b[0] > 0. && b[-1] > 0.) interpolation[0] = 1;
else interpolation[0] = 0;
if (G >= Gthr && b[0] >= bthr) utterance[0] = 1;
else utterance[0] = 0;

/* Alternative correction of the long-term predictor delay: */
if (interpolation[0] && !utterance[0]) {
    DELTAd = sround((double)(h_ * d[0])/(GAMMA * U));
    if (DELTAd < -absDELTAdmax) DELTAd = -absDELTAdmax;
    else if (DELTAd > absDELTAdmax) DELTAd = absDELTAdmax;
    if (d[0]+DELTAd >= Ls && d[0]+DELTAd <= D && (double)abs(d[0]+DELTAd-d[-1])/d[-1] <= EPSILONdthr) d[0] += DELTAd;
}
/* Indication of the long-term predictor delay or of a zero coefficient: */
if (b[0] > 0.) id = d[0]-Ls+1;
else id = Nld;

[0034] 2) Minimization of the error energy. In the notation of this part, "bang" indicates that an
onset has been detected.

/* Preparation for the time shift and innovation search: */
if (interpolation[0] || utterance[0]) {
    Ers = Ers_ = 0.;
    for (n = o+Ls-d[0]+h; n <= o+h-1; n++) Ers += rs[n] * rs[n];
    for (; n <= o+Ls+h-1; n++) Ers_ += rs[n] * rs[n];
    absrsmax = -DBL_MAX;
    for (n = GAMMA*o+h_; n <= GAMMA*(o+Ls)+h_-1; n++) {
        absrs = fabs(rs_[n]);
        if (absrs > absrsmax) {

[0035]

            np[0] = n;
            absrsmax = absrs;
        }
    }
    np[0] -= GAMMA*o;

    if (Ers_ >= Ersthr && d[0]*Ers_/(Ls*(Ers+Ers_)) >= ROPrsthr) {
        hmin = -GAMMA*H;
        if (peak && np[-1] <= h_-1 && hmin < np[-1]) hmin = np[-1];
        hmax = GAMMA*H;
        if (hmax > np[0]) hmax = np[0];
        DELTAhpL = h_-hmin;
        DELTAhpH = hmax-h_;
        if (DELTAhpL >= sround(GAMMA*DELTAhp) && DELTAhpH >= sround(GAMMA*DELTAhp)) {
            hmin = h_-sround(GAMMA*DELTAhp);
            hmax = h_+sround(GAMMA*DELTAhp);
        }
        else if (DELTAhpL >= DELTAhpH) hmin = max(hmax-2*sround(GAMMA*DELTAhp), hmin);
        else hmax = min(hmin+2*sround(GAMMA*DELTAhp), hmax);
        peak = 1;
    }
    else {
        hmin = hmax = h_;
        peak = 0;
    }
}

[0036] 

else {
    if (h_ < 0) hmin = hmax = min(h_+sround(GAMMA*DELTAhr), 0);
    else hmin = hmax = max(h_-sround(GAMMA*DELTAhr), 0);
    peak = 0;
}

if (l <= Nis_-1) {
    h_[0] = Eh[Ls-1] = 1.;
    for (n = 1, n_ = Ls-2; n <= Ls-1; n++, n_--) {
        h_[n] = 0.;
        for (k = 1; k <= min(n, P); k++) h_[n] += aw[l][k] * h_[n-k];
        Eh[n_] = Eh[n_+1] + h_[n] * h_[n];
    }
    for (i = Nik, j = Ni2; i >= 1; i--) {
        ro[j] = 2. * h_[n2key[i]];
        for (n = 1, j--; n <= Ls-1-n2key[i]; n++, j--) ro[j] = ro[j+1] + 2. * h_[n+n2key[i]] * h_[n];
    }
}
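The first loop above builds the impulse response of the weighted synthesis filter together with a table of pulse energies. A minimal sketch of that computation follows, assuming a hypothetical helper `impulse_energy`, zero-based arrays, and coefficients stored in `a[1..p]`; the names are illustrative and not those of the listing.

```c
#include <assert.h>

/* Hypothetical helper (not part of the listing). a[1..p] are the
 * coefficients of the all-pole recursion h[n] = sum_k a[k]*h[n-k] with
 * h[0] = 1. On return, e[m] is the energy of the response to a unit pulse
 * at position m of an ls-sample subframe: the sum of h[n]^2 for
 * n = 0 .. ls-1-m, the response being truncated at the subframe end. */
void impulse_energy(const double *a, int p, int ls, double *h, double *e)
{
    assert(p >= 0 && ls >= 1);
    h[0] = 1.0;
    e[ls - 1] = 1.0;              /* a pulse on the last sample contributes h[0]^2 only */
    for (int n = 1, m = ls - 2; n <= ls - 1; n++, m--) {
        h[n] = 0.0;
        int kmax = n < p ? n : p;
        for (int k = 1; k <= kmax; k++)
            h[n] += a[k] * h[n - k];
        e[m] = e[m + 1] + h[n] * h[n];   /* one more response sample fits */
    }
}
```

Building the energies from the end of the subframe backwards lets each entry reuse the previous one, so the whole table costs a single pass over the impulse response.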

/* Search for the time shift of the short-term prediction residual signal: */
for (n = o; n <= o+Ls-1; n++) {

[0037]

    xw[n] = 0.;
    for (k = 1; k <= P; k++) xw[n] += aw[l][k] * xw[n-k];
    el[n-o] = xw[n] - yw[n];
}
Eelmin = DBL_MAX;
for (i = 0, h_ = hmax; i <= GAMMA-1 && h_ >= hmin; i++, h_--) {
    Eel = 0.;
    for (n = 0, n_ = GAMMA*o+h_; n <= Ls-1; n++, n_ += GAMMA) {
        xw2[i][n] = rs_[n_];
        for (k = 1; k <= min(n, P); k++) xw2[i][n] += aw[l][k] * xw2[i][n-k];
        el_ = el[n] + xw2[i][n];
        Eel += el_ * el_;
    }
    if (Eel < Eelmin) {
        h_ = h_;
        Eelmin = Eel;
    }
}

for (i = 0, n_ = GAMMA*o+h_; h_ >= hmin; i = (i < GAMMA-1) ? i+1 : 0, h_--, n_--) {
    Eel = 0.;
    for (n = Ls-1; n >= 1; n--) {

[0038]

        xw2[i][n] = xw2[i][n-1] + h_[n] * rs_[n_];
        el_ = el[n] + xw2[i][n];
        Eel += el_ * el_;
    }
    xw2[i][0] = rs_[n_];
    el_ = el[0] + xw2[i][0];
    Eel += el_ * el_;
    if (Eel < Eelmin) {
        h_ = h_;
        Eelmin = Eel;
    }
}
h = sround((double)h_/GAMMA);

/* Computation of the weighted speech and of the long-term component of the weighted error; preparation for the ****** search: */
for (n = o, n_ = GAMMA*o+h_; n <= o+Ls-1; n++, n_ += GAMMA) {
    xw[n] = rs_[n_];
    for (k = 1; k <= P; k++) xw[n] += aw[l][k] * xw[n-k];
    el[n-o] = xw[n] - yw[n];
    sqrs[n-o] = rs_[n_] * rs_[n_];
}
for (k = 0; k <= Ls-1; k++) {
    Relh[k] = el[k];
    for (n = 1; n <= Ls-1-k; n++) Relh[k] += el[n+k] * h_[n];
}
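The last loop above correlates the error signal with the impulse response, once per candidate pulse position. As a minimal sketch, assuming zero-based arrays and a hypothetical helper `pulse_correlation` that is not part of the listing:

```c
#include <assert.h>

/* Hypothetical helper (not part of the listing): for every pulse position k,
 * rel[k] = sum over n = 0 .. ls-1-k of el[n+k] * h[n], i.e. the correlation
 * between the error signal and the impulse response shifted to position k. */
void pulse_correlation(const double *el, const double *h, int ls, double *rel)
{
    assert(ls >= 1);
    for (int k = 0; k < ls; k++) {
        rel[k] = 0.0;
        for (int n = 0; n + k < ls; n++)
            rel[k] += el[n + k] * h[n];
    }
}
```

The listing starts each sum at `el[k]` rather than at zero; the two forms agree whenever h[0] equals 1, which is how the listing initializes its impulse response.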

[0039] 

/* Alternative innovation gain clipping and preparation for the alternative innovation search: */
if (utterance[0]) absg1max = nu * absrsmax;
else {
    Ers = 0.;
    for (n = 0; n <= B-1; n++) Ers += sqrs[n];
    Ersmax = Ers_ = Ers;
    for (; n <= Ls-1; n++) {
        Ers += sqrs[n];
        Ers_ += sqrs[n] - sqrs[n-B];
        if (Ers_ > Ersmax) Ersmax = Ers_;
    }
    if (Ers >= Ersthr_ && Ls*Ersmax/(B*Ers) >= ROPrsthr_) bang = 1;
    else bang = 0;
}
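The onset test above compares the largest energy found in a sliding window of B samples with the average energy of the subframe. A minimal sketch of that test follows; the helper name `detect_onset` and its parameter names are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical helper (not part of the listing). sq[n] holds squared
 * samples. Returns 1 when the total energy reaches e_thr and the largest
 * energy in any window of b consecutive samples is at least rho times the
 * energy such a window would hold if the total were spread evenly. */
int detect_onset(const double *sq, int ls, int b, double e_thr, double rho)
{
    assert(b >= 1 && b <= ls);
    double total = 0.0;
    for (int n = 0; n < b; n++)
        total += sq[n];
    double win = total;          /* energy of the current window */
    double win_max = total;      /* largest window energy so far */
    for (int n = b; n < ls; n++) {
        total += sq[n];
        win += sq[n] - sq[n - b];    /* slide the window by one sample */
        if (win > win_max)
            win_max = win;
    }
    /* The energy threshold is tested first, so the ratio is never
     * evaluated for a silent subframe when e_thr is positive. */
    return total >= e_thr && ls * win_max / (b * total) >= rho;
}
```

A flat subframe gives a ratio of 1 and is not flagged, while a subframe whose energy is concentrated in a few adjacent samples gives a ratio approaching ls/b.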

[0040] 

/* Alternative search for the innovation parameters with alternative innovation gain clipping: */
n1[l] = n2[l] = 0;
g1[l] = g2[l] = 0.;
DELTAEelemax = -DBL_MAX;
if (utterance[0] || bang)
    for (n1_ = 0, i = 1; n1_ <= Ls-1; n1_++, i++) {
        Eu = Eh[n1_];
        Relu = Relh[n1_];
        g1_ = Relu/Eu;
        DELTAEele = g1_ * Relu;
        if (DELTAEele > DELTAEelemax) {
            n1[l] = n1_;
            is[l] = i;
            g1[l] = g1_;
            DELTAEelemax = DELTAEele;
        }
    }
else i = Ls+1;
if (utterance[0])
    if (g1[l] < -absg1max) g1[l] = -absg1max;
    else if (g1[l] > absg1max) g1[l] = absg1max;
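The single-pulse search above picks the position that gives the largest reduction in error energy, which for a pulse with optimal gain equals the squared correlation divided by the pulse energy. A minimal sketch, assuming a hypothetical helper `best_pulse` and precomputed tables `r` (correlations) and `e` (energies, all positive):

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper (not part of the listing): returns the position n
 * that maximizes r[n]*r[n]/e[n] and stores the matching optimal gain
 * r[n]/e[n] in *gain. All e[n] are assumed to be strictly positive. */
int best_pulse(const double *r, const double *e, int ls, double *gain)
{
    assert(ls >= 1);
    double best = -DBL_MAX;   /* largest energy reduction so far */
    int pos = 0;
    *gain = 0.0;
    for (int n = 0; n < ls; n++) {
        double g = r[n] / e[n];          /* optimal gain at position n */
        double reduction = g * r[n];     /* equals r[n]^2 / e[n] */
        if (reduction > best) {
            best = reduction;
            pos = n;
            *gain = g;
        }
    }
    return pos;
}
```

Because only the two tables are consulted, no candidate excitation has to be passed through the synthesis filter during the search.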

for (j = 1, k = 1; j <= Nik; j++)
    for (n1_ = 0, n2_ = n2key[j]; n2_ <= Ls-1; n1_++, n2_++, i++, k++) {
        Eu = Eh[n1_] + Eh[n2_] - ro[k];
        Relu = Relh[n1_] - Relh[n2_];
        g1_ = Relu/Eu;
        DELTAEele = g1_ * Relu;

[0041]

        if (DELTAEele > DELTAEelemax) {
            n1[l] = n1_;
            n2[l] = n2_;
            is[l] = i;
            g1[l] = g1_;
            g2[l] = -g1_;
            DELTAEelemax = DELTAEele;
        }
        i++;
        Eu = Eh[n1_] + Eh[n2_] + ro[k];
        Relu = Relh[n1_] + Relh[n2_];
        g1_ = Relu/Eu;
        DELTAEele = g1_ * Relu;
        if (DELTAEele > DELTAEelemax) {
            n1[l] = n1_;
            n2[l] = n2_;
            is[l] = i;
            g1[l] = g1_;
            g2[l] = g1_;
            DELTAEelemax = DELTAEele;
        }
    }



DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] Block diagram of an encoder

[Drawing 2] Functional diagram of a block of the encoder

[Drawing 3] Block diagram of a decoder

* NOTICES * 

JPO and NCIPI are not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer. So the translation may not reflect the original 
precisely. 

2. **** shows the word which can not be translated. 
3. In the drawings, any words are not translated.